Search Results (514)

Search Parameters:
Keywords = depth-wise separable convolutions

26 pages, 6872 KB  
Article
DSEPGAN: A Dual-Stream Enhanced Pyramid Based on Generative Adversarial Network for Spatiotemporal Image Fusion
by Dandan Zhou, Lina Xu, Ke Wu, Huize Liu and Mengting Jiang
Remote Sens. 2025, 17(24), 4050; https://doi.org/10.3390/rs17244050 - 17 Dec 2025
Abstract
Many deep learning-based spatiotemporal fusion (STF) methods have been proven to achieve high accuracy and robustness. Due to the variable shapes and sizes of objects in remote sensing images, pyramid networks are generally introduced to extract multi-scale features. However, the down-sampling operation in the pyramid structure may lead to the loss of image detail information, affecting the model’s ability to reconstruct fine-grained targets. To address this issue, we propose a novel Dual-Stream Enhanced Pyramid based on Generative Adversarial Network (DSEPGAN) for the spatiotemporal fusion of remote sensing images. The network adopts a dual-stream architecture to separately process coarse and fine images, tailoring feature extraction to their respective characteristics: coarse images provide temporal dynamics, while fine images contain rich spatial details. A reversible feature transformation is embedded in the pyramid feature extraction stage to preserve high-frequency information, and a fusion module employing large-kernel and depthwise separable convolutions captures long-range dependencies across inputs. To further enhance realism and detail fidelity, adversarial training encourages the network to generate sharper and more visually convincing fusion results. The proposed DSEPGAN is compared with widely used and state-of-the-art STF models on three publicly available datasets. The results illustrate that DSEPGAN achieves superior performance across various evaluation metrics, highlighting its notable advantages for predicting seasonal variations in highly heterogeneous regions and abrupt changes in land use. Full article
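Many of the results in this listing lean on the same arithmetic: a depthwise separable convolution factors a standard convolution into a per-channel spatial filter followed by a 1×1 pointwise channel mix. A minimal sketch of the parameter savings (helper names are illustrative, not taken from any of the listed papers):

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return c_in * c_out * k * k

def dsconv_params(c_in, c_out, k):
    """Depthwise k x k conv (one filter per input channel) + 1x1 pointwise conv."""
    depthwise = c_in * k * k
    pointwise = c_in * c_out
    return depthwise + pointwise

# Example: mapping 256 channels to 256 channels with a 3x3 kernel
std = conv_params(256, 256, 3)    # 589,824 weights
dws = dsconv_params(256, 256, 3)  # 67,840 weights: roughly an 8.7x reduction
```

The same factorization also cuts multiply-accumulate counts by about the same ratio, which is why it recurs in nearly every lightweight architecture above.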
20 pages, 4529 KB  
Article
Intelligent Recognition of Muffled Blasting Sounds and Lithology Prediction in Coal Mines Based on RDGNet
by Gengxin Li, Hua Ding, Kai Wang, Xiaoqiang Zhang and Jiacheng Sun
Sensors 2025, 25(24), 7601; https://doi.org/10.3390/s25247601 - 15 Dec 2025
Abstract
In the Yangquan coal mining region, China, muffled blasting sounds commonly occur in mine surrounding rocks resulting from instantaneous energy release following the elastic deformation of overlying brittle rock layers; they are related to fracture development. Although these events rarely cause immediate hazards, their acoustic signatures contain critical information about cumulative rock damage. Currently, conventional monitoring of muffled blasting sounds and surrounding rock stability relies on microseismic systems and on-site sampling techniques. However, these methods exhibit low identification efficiency for muffled blasting events, poor real-time performance, and strong subjectivity arising from manual signal interpretation and empirical threshold setting. This article proposes the retentive depthwise gated network (RDGNet). By combining retentive network sequence modeling, depthwise separable convolution, and a gated fusion mechanism, RDGNet enables multimodal feature extraction and the fusion of acoustic emission sequences and audio Mel spectrograms, supporting real-time muffled blasting sound recognition and lithology classification. Results confirm model robustness under noisy and multisource mixed-signal conditions (overall accuracy: 92.12%, area under the curve: 0.985, and Macro F1: 0.931). This work provides an efficient approach for intelligent monitoring of coal mine rock stability and can be extended to safety assessments in underground engineering, advancing the mining industry toward preventive management. Full article
(This article belongs to the Section Intelligent Sensors)

26 pages, 2632 KB  
Article
CAGM-Seg: A Symmetry-Driven Lightweight Model for Small Object Detection in Multi-Scenario Remote Sensing
by Hao Yao, Yancang Li, Wenzhao Feng, Ji Zhu, Haiming Yan, Shijun Zhang and Hanfei Zhao
Symmetry 2025, 17(12), 2137; https://doi.org/10.3390/sym17122137 - 12 Dec 2025
Abstract
In order to address challenges in small object recognition for remote sensing imagery—including high model complexity, overfitting with small samples, and insufficient cross-scenario generalization—this study proposes CAGM-Seg, a lightweight recognition model integrating multi-attention mechanisms. The model systematically enhances the U-Net architecture: First, the encoder adopts a pre-trained MobileNetV3-Large as the backbone network, incorporating a coordinate attention mechanism to strengthen spatial localization of small targets. Second, an attention gating module is introduced in skip connections to achieve adaptive fusion of cross-level features. Finally, the decoder fully employs depthwise separable convolutions to significantly reduce model parameters. This design embodies a symmetry-aware philosophy, which is reflected in two aspects: the structural symmetry between the encoder and decoder facilitates multi-scale feature fusion, while the coordinate attention mechanism performs symmetric decomposition of spatial context (i.e., along height and width directions) to enhance the perception of geometrically regular small targets. Regarding training strategy, a hybrid loss function combining Dice Loss and Focal Loss, coupled with the AdamW optimizer, effectively enhances the model’s sensitivity to small objects while suppressing overfitting. Experimental results on the Xingtai black and odorous water body identification task demonstrate that CAGM-Seg outperforms comparison models in key metrics including precision (97.85%), recall (98.08%), and intersection-over-union (96.01%). Specifically, its intersection-over-union surpassed SegNeXt by 11.24 percentage points and PIDNet by 8.55 percentage points; its F1 score exceeded SegFormer by 2.51 percentage points. Regarding model efficiency, CAGM-Seg features a total of 3.489 million parameters, with 517,000 trainable parameters—approximately 80% fewer than the baseline U-Net—achieving a favorable balance between recognition accuracy and computational efficiency. Further cross-task validation demonstrates the model’s robust cross-scenario adaptability: it achieves 82.77% intersection-over-union and 90.57% F1 score in landslide detection, while maintaining 87.72% precision and 86.48% F1 score in cloud detection. The main contribution of this work is the effective resolution of key challenges in few-shot remote sensing small-object recognition—notably inadequate feature extraction and limited model generalization—via the strategic integration of multi-level attention mechanisms within a lightweight architecture. The resulting model, CAGM-Seg, establishes an innovative technical framework for real-time image interpretation under edge-computing constraints, demonstrating strong potential for practical deployment in environmental monitoring and disaster early warning systems. Full article
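CAGM-Seg's training strategy combines Dice Loss and Focal Loss. A minimal NumPy sketch of that hybrid for the binary case (function names and the equal weighting are assumptions for illustration, not taken from the paper):

```python
import numpy as np

def dice_loss(pred, target, eps=1e-6):
    """Soft Dice loss on probabilities; rewards overlap, robust to class imbalance."""
    inter = np.sum(pred * target)
    return 1.0 - (2.0 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)

def focal_loss(pred, target, gamma=2.0, eps=1e-6):
    """Binary focal loss; the (1 - p_t)^gamma factor down-weights easy pixels."""
    p_t = np.where(target == 1, pred, 1.0 - pred)
    return float(np.mean(-((1.0 - p_t) ** gamma) * np.log(p_t + eps)))

# Toy probabilities for four pixels against a binary ground-truth mask
pred = np.array([0.9, 0.8, 0.2, 0.1])
target = np.array([1.0, 1.0, 0.0, 0.0])
hybrid = dice_loss(pred, target) + focal_loss(pred, target)
```

Dice supplies a region-level overlap signal while focal loss keeps per-pixel gradients focused on hard examples, which is why the pair is a common choice for small-object segmentation.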

22 pages, 1479 KB  
Article
VMPANet: Vision Mamba Skin Lesion Image Segmentation Model Based on Prompt and Attention Mechanism Fusion
by Zinuo Peng, Shuxian Liu and Chenhao Li
J. Imaging 2025, 11(12), 443; https://doi.org/10.3390/jimaging11120443 - 11 Dec 2025
Abstract
In the realm of medical image processing, the segmentation of dermatological lesions is a pivotal technique for the early detection of skin cancer. However, existing methods for segmenting images of skin lesions often encounter limitations when dealing with intricate boundaries and diverse lesion shapes. To address these challenges, we propose VMPANet, designed to accurately localize critical targets and capture edge structures. VMPANet employs an inverted pyramid convolution to extract multi-scale features while utilizing the visual Mamba module to capture long-range dependencies among image features. Additionally, we leverage previously extracted masks as cues to facilitate efficient feature propagation. Furthermore, VMPANet integrates parallel depthwise separable convolutions to enhance feature extraction and introduces innovative mechanisms for edge enhancement, spatial attention, and channel attention to adaptively extract edge information and complex spatial relationships. Notably, VMPANet refines a novel cross-attention mechanism, which effectively facilitates the interaction between deep semantic cues and shallow texture details, thereby generating comprehensive feature representations while reducing computational load and redundancy. We conducted comparative and ablation experiments on two public skin lesion datasets (ISIC2017 and ISIC2018). The results demonstrate that VMPANet outperforms existing mainstream methods. On the ISIC2017 dataset, its mIoU and DSC metrics are 1.38% and 0.83% higher than those of VM-Unet respectively; on the ISIC2018 dataset, these metrics are 1.10% and 0.67% higher than those of EMCAD, respectively. Moreover, VMPANet boasts a parameter count of only 0.383 M and a computational load of 1.159 GFLOPs. Full article
(This article belongs to the Section Medical Imaging)

26 pages, 5681 KB  
Article
Physiological Artifact Suppression in EEG Signals Using an Efficient Multi-Scale Depth-Wise Separable Convolution and Variational Attention Deep Learning Model for Improved Neurological Health Signal Quality
by Vandana Akshath Raj, Tejasvi Parupudi, Vishnumurthy Kedlaya K, Ananthakrishna Thalengala and Subramanya G. Nayak
Technologies 2025, 13(12), 578; https://doi.org/10.3390/technologies13120578 - 9 Dec 2025
Abstract
Artifacts remain a major challenge in electroencephalogram (EEG) recordings, often degrading the accuracy of clinical diagnosis, brain computer interface (BCI) systems, and cognitive research. Although recent deep learning approaches have advanced EEG denoising, most still struggle to model long-range dependencies, maintain computational efficiency, and generalize to unseen artifact types. To address these challenges, this study proposes MDSC-VA, an efficient denoising framework that integrates multi-scale (M) depth-wise separable convolution (DSConv), variational autoencoder-based (VAE) latent encoding, and a multi-head self-attention mechanism. This unified architecture effectively balances denoising accuracy and model complexity while enhancing generalization to unseen artifact types. Comprehensive evaluations on three open-source EEG datasets, including EEGdenoiseNet, a Motion Artifact Contaminated Multichannel EEG dataset, and the PhysioNet EEG Motor Movement/Imagery dataset, demonstrate that MDSC-VA consistently outperforms state-of-the-art methods, achieving a higher signal-to-noise ratio (SNR), lower relative root mean square error (RRMSE), and stronger correlation coefficient (CC) values. Moreover, the model preserved over 99% of the dominant neural frequency band power, validating its ability to retain physiologically relevant rhythms. These results highlight the potential of MDSC-VA for reliable clinical EEG interpretation, real-time BCI systems, and advancement towards sustainable healthcare technologies in line with SDG-3 (Good Health and Well-Being). Full article
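The MDSC-VA evaluation reports SNR, RRMSE, and correlation coefficient. These denoising metrics are standard and can be sketched as follows (the synthetic 10 Hz signal is illustrative, not drawn from any of the cited datasets):

```python
import numpy as np

def snr_db(clean, denoised):
    """Signal-to-noise ratio of the reconstruction residual, in dB."""
    noise = clean - denoised
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

def rrmse(clean, denoised):
    """Relative RMSE: residual RMS normalized by the clean signal's RMS."""
    return np.sqrt(np.mean((clean - denoised) ** 2) / np.mean(clean ** 2))

def cc(clean, denoised):
    """Pearson correlation coefficient between clean and denoised signals."""
    return float(np.corrcoef(clean, denoised)[0, 1])

t = np.linspace(0, 1, 500)
clean = np.sin(2 * np.pi * 10 * t)  # a 10 Hz "alpha-like" rhythm
denoised = clean + 0.01 * np.random.default_rng(0).standard_normal(500)
```

Higher SNR, lower RRMSE, and CC close to 1 together indicate that denoising removed the artifact without distorting the underlying rhythm.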

16 pages, 1037 KB  
Article
Research on a Lightweight Recognition Model for Daily Cattle Behavior Toward Real-Time Monitoring
by Jianping Yao, Yong’an Zhang, Mei’an Li, Jia Li, Yanqiu Liu, Feilong Kang and Fan Liu
Vet. Sci. 2025, 12(12), 1166; https://doi.org/10.3390/vetsci12121166 - 8 Dec 2025
Abstract
Accurate monitoring of cattle behavioral time budgets is crucial for early disease detection and welfare assessment. Changes in durations of standing, lying, and eating are known to be early indicators of health issues such as lameness and metabolic disorders. To enable low-cost, non-invasive, and real-time monitoring, this study proposes a lightweight cattle behavior recognition method based on an improved YOLO11n architecture. The model enhances multi-scale feature integration through a generalized efficient layer aggregation network (GELAN), improves feature extraction via a multidimensional collaborative attention (MCA) mechanism, and achieves efficient cross-scale fusion using a bidirectional feature pyramid network (BiFPN). Depthwise separable convolution (DWConv) is incorporated to reduce computational load. Experimental results demonstrate high recognition accuracy, with mAP@0.5 values of 91.2%, 91.0%, and 93.9% for standing, lying, and eating, respectively. The model was subsequently compressed using a Layer-adaptive Magnitude-based Pruning (LAMP) algorithm, resulting in a final model of only 1.06 × 106 parameters, a computational cost of 6.3 GFLOPS, and a weight size of 2.4 MB, while retaining 90.7% mAP@0.5. This highly efficient system is suitable for deployment on resource-constrained edge devices, providing a practical tool for continuous cattle monitoring. It offers a viable pathway for farmers to adopt precision livestock farming practices, facilitating early health intervention and promoting animal welfare. Full article

20 pages, 3620 KB  
Article
EMS-UKAN: An Efficient KAN-Based Segmentation Network for Water Leakage Detection of Subway Tunnel Linings
by Meide He, Lei Tan, Xiaohui Yang, Fei Liu, Zhimin Zhao and Xiaochun Wu
Appl. Sci. 2025, 15(24), 12859; https://doi.org/10.3390/app152412859 - 5 Dec 2025
Abstract
Water leakage in subway tunnel linings poses significant risks to structural safety and long-term durability, making accurate and efficient leakage detection a critical task. Existing deep learning methods, such as UNet and its variants, often suffer from large parameter sizes and limited ability to capture multi-scale features, which restrict their applicability in real-world tunnel inspection. To address these issues, we propose an Efficient Multi-Scale U-shaped KAN-based Segmentation Network (EMS-UKAN) for detecting water leakage in subway tunnel linings. To reduce computational cost and enable edge-device deployment, the backbone replaces conventional convolutional layers with depthwise separable convolutions, and an Edge-Enhanced Depthwise Separable Convolution Module (EEDM) is incorporated in the decoder to strengthen boundary representation. The PKAN Block is introduced in the bottleneck to enhance nonlinear feature representation and improve the modeling of complex relationships among latent features. In addition, an Adaptive Multi-Scale Feature Extraction Block (AMS Block) is embedded within early skip connections to capture both fine-grained and large-scale leakage features. Extensive experiments on the newly collected Tunnel Water Leakage (TWL) dataset demonstrate that EMS-UKAN outperforms classical models, achieving competitive segmentation performance. In addition, it effectively reduces computational complexity, providing a practical solution for real-world tunnel inspection. Full article

16 pages, 5826 KB  
Article
Multi-Scale Feature Fusion Convolutional Neural Network Fault Diagnosis Method for Rolling Bearings
by Wen Yang, Meijuan Hu, Xionglu Peng and Jianghong Yu
Processes 2025, 13(12), 3929; https://doi.org/10.3390/pr13123929 - 4 Dec 2025
Abstract
Fault diagnosis methods for rolling bearings are frequently constrained to the automatic extraction of single-scale features from raw vibration signals, overlooking crucial information embedded in data of other scales, which often results in unsatisfactory diagnostic outcomes. To address this, a lightweight neural network model is proposed, which incorporates an improved Inception module for multi-scale convolutional feature fusion. Initially, this model generates time–frequency maps via continuous wavelet transform. Subsequently, it integrates the Fused-conv and Mbconv modules from the EfficientNet V2 architecture with the Inception module to conduct multi-scale convolution on input features, thereby comprehensively capturing fault information of the bearing. Additionally, it substitutes traditional convolution with depthwise separable convolution to minimize training parameters and introduces an attention mechanism to emphasize significant features while diminishing less relevant ones, thereby enhancing the accuracy of bearing fault diagnosis. Experimental findings indicate that the proposed fault diagnosis model achieves an accuracy of 100% under single-load conditions and 96.2% under variable-load conditions, demonstrating its applicability across diverse data sets and robust generalization capabilities. Full article
(This article belongs to the Section Process Control and Monitoring)

20 pages, 3406 KB  
Article
Efficient and Interpretable ECG Abnormality Detection via a Lightweight DSCR-BiGRU-Attention Network with Demographic Fusion
by Kan Luo, Longying Huang, Haixin He, Yu Chen, Lu You, Siluo Chen, Jian Chen and Chengyu Liu
Mathematics 2025, 13(23), 3882; https://doi.org/10.3390/math13233882 - 3 Dec 2025
Abstract
Deep learning has advanced automated electrocardiogram (ECG) interpretation, yet many models are computationally expensive, opaque, and overlook demographic factors. We propose DBA-ASFNet, a lightweight network that combines depthwise-separable convolutional residual blocks with a BiGRU and an attention mechanism to extract rich spatiotemporal features from 12-lead ECGs while maintaining low computational requirements. The Age-and-Sex Fusion (ASF) module integrates demographic information without enlarging the model, enabling personalized predictions. On the PTB-XL and CPSC2018 datasets, DBA-ASFNet achieves competitive multi-label performance with only ~0.03 million parameters and ~6.43 MFLOPs per inference. Real-time testing on a Raspberry Pi 5 achieved an average inference latency of ~2 ms, supporting deployment on resource-limited devices. Shapley additive explanations (SHAP) analysis shows that the model focuses on clinically meaningful ECG patterns and appropriately incorporates demographic factors, enhancing transparency. These results suggest that DBA-ASFNet is suited for accurate, efficient, and interpretable ECG analysis. Full article

34 pages, 11986 KB  
Article
High-Speed Die Bond Quality Detection Using Lightweight Architecture DSGβSI-SECS-Yolov7-Tiny
by Bao Rong Chang, Hsiu-Fen Tsai and Wei-Shun Chang
Sensors 2025, 25(23), 7358; https://doi.org/10.3390/s25237358 - 3 Dec 2025
Abstract
The die bonding process significantly impacts the yield and quality of IC packaging, and its quality detection is also a critical image sensing technology. With the advancement of machine automation and increased operating speeds, the misclassification rate in die bond image inspection has also risen. Therefore, this study develops a high-speed intelligent vision inspection model that slightly improves classification accuracy and adapts to the operation of new-generation machines. Furthermore, by identifying the causes of die bonding defects, key process parameters can be adjusted in real time during production, thereby improving the yield of the die bonding process and substantially reducing manufacturing cost losses. Previously, we proposed a lightweight model named DSGβSI-YOLOv7-tiny, which integrates depthwise separable convolution, Ghost convolution, and a Sigmoid activation function with a learnable β parameter. This model enables real-time and efficient detection and prediction of die bond quality through image sensing. We further enhanced the previous model by incorporating an SE layer, ECA-Net, Coordinate Attention, and a Small Object Enhancer to accommodate the faster operation of new machines. This improvement resulted in a more lightweight architecture named DSGβSI-SECS-YOLOv7-tiny. Compared with the previous model, the proposed model achieves an increased inference speed of 294.1 FPS and a Precision of 99.1%. Full article

22 pages, 3756 KB  
Article
Browser-Based Multi-Cancer Classification Framework Using Depthwise Separable Convolutions for Precision Diagnostics
by Divine Sebukpor, Ikenna Odezuligbo, Maimuna Nagey, Michael Chukwuka, Oluwamayowa Akinsuyi and Blessing Ndubuisi
Diagnostics 2025, 15(23), 3066; https://doi.org/10.3390/diagnostics15233066 - 1 Dec 2025
Abstract
Background: Early and accurate cancer detection remains a critical challenge in global healthcare. Deep learning has shown strong diagnostic potential, yet widespread adoption is limited by dependence on high-performance hardware, centralized servers, and data-privacy risks. Methods: This study introduces a browser-based multi-cancer classification framework that performs real-time, client-side inference using TensorFlow.js—eliminating the need for external servers or specialized GPUs. The proposed model fine-tunes the Xception architecture, leveraging depthwise separable convolutions for efficient feature extraction, on a large multi-cancer dataset of over 130,000 histopathological and cytological images spanning 26 cancer types. It was benchmarked against VGG16, ResNet50, EfficientNet-B0, and Vision Transformer. Results: The model achieved a Top-1 accuracy of 99.85% and Top-5 accuracy of 100%, surpassing all comparators while maintaining lightweight computational requirements. Grad-CAM visualizations confirmed that predictions were guided by histopathologically relevant regions, reinforcing interpretability and clinical trust. Conclusions: This work represents the first fully browser-deployable, privacy-preserving deep learning framework for multi-cancer diagnosis, demonstrating that high-accuracy AI can be achieved without infrastructure overhead. It establishes a practical pathway for equitable, cost-effective global deployment of medical AI tools. Full article
(This article belongs to the Special Issue Artificial Intelligence-Driven Radiomics in Medical Diagnosis)
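The Top-1 and Top-5 accuracy reported for the browser-based classifier reduce to a simple ranking check over class scores. A small illustrative implementation (the 3-class example is hypothetical):

```python
import numpy as np

def top_k_accuracy(probs, labels, k=1):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    topk = np.argsort(probs, axis=1)[:, -k:]  # indices of the k largest scores per row
    hits = [label in row for row, label in zip(topk, labels)]
    return float(np.mean(hits))

probs = np.array([[0.1, 0.7, 0.2],
                  [0.5, 0.3, 0.2],
                  [0.2, 0.2, 0.6]])
labels = np.array([1, 1, 2])  # the second sample's true class is only second-ranked
top1 = top_k_accuracy(probs, labels, k=1)  # 2/3: the second sample is misranked
top2 = top_k_accuracy(probs, labels, k=2)  # 1.0: its true class enters the top 2
```

Top-5 accuracy of 100% on 26 classes, as in the paper's result, means every sample's true class appeared somewhere in its five highest-scoring predictions.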

20 pages, 26260 KB  
Article
AFMNet: A Dual-Domain Collaborative Network with Frequency Prior Guidance for Low-Light Image Enhancement
by Qianqian An and Long Ma
Entropy 2025, 27(12), 1220; https://doi.org/10.3390/e27121220 - 1 Dec 2025
Abstract
Low-light image enhancement (LLIE) degradation arises from insufficient illumination, reflectance occlusion, and noise coupling, and it manifests in the frequency domain as suppressed amplitudes with relatively stable phases. To address the fact that pure spatial mappings struggle to balance brightness enhancement and detail fidelity, whereas pure frequency-domain processing lacks semantic modeling, we propose AFMNet—a dual-domain collaborative enhancement network guided by an information-theoretic frequency prior. This prior regularizes global illumination, while spatial branches restore local details. First, a Multi-Scale Amplitude Estimator (MSAE) adaptively generates fine-grained amplitude-modulation maps via multi-scale fusion, encouraging higher output entropy through adaptive spectral-energy redistribution. Next, a Dual-Branch Spectral–Spatial Attention (DBSSA) module—comprising a Frequency-Modulated Attention Block (FMAB) and a Scale-Variable Depth Attention Block (SVDAB)—is employed: FMAB injects the modulation map as a frequency-domain prior into the attention mechanism to conditionally modulate the amplitude of value features while keeping the phase unchanged, thereby helping to preserve structural information in the enhanced output; SVDAB uses multi-scale depthwise-separable convolutions with scale attention to produce adaptively enhanced spatial features. Finally, a Spectral-Gated Feed-Forward Network (SGFFN) applies learnable spectral filters to local features for band-wise selective enhancement. This collaborative design achieves a favorable balance between illumination correction and detail preservation, and AFMNet delivers state-of-the-art performance on multiple low-light enhancement benchmarks. Full article

25 pages, 8066 KB  
Article
Estimation of All-Weather Daily Surface Net Radiation over the Tibetan Plateau Using an Optimized CNN Model
by Bin Ma, Yaoming Ma and Weiqiang Ma
Remote Sens. 2025, 17(23), 3894; https://doi.org/10.3390/rs17233894 - 30 Nov 2025
Abstract
Accurate daily surface net radiation (Rn) estimation over the Tibetan Plateau’s complex and highly heterogeneous terrain is essential for advancing the understanding of land–atmosphere exchanges and regional climate processes. This study developed an optimized deep learning framework that systematically evaluates 19 CNN architectures using a per-pixel multivariate regression design (1 × 1 × 21). The channel-rich representation incorporates engineered neighborhood descriptors to statistically embed spatial context while fully avoiding the mosaic and boundary artifacts common in patch-based approaches. Among all tested networks, Xception delivered the best combination of accuracy (R2 > 0.94), computational efficiency, and physical consistency. Its depthwise separable convolutions and skip connections enable hierarchical nonlinear cross-channel feature learning, effectively capturing the complex dependencies between surface variables and Rn. Independent validation confirmed stable performance under diverse weather conditions and substantially better skill than GLASS, especially across rugged terrain and high-albedo surfaces. SHAP analysis further highlights physically meaningful behavior, with astronomical and topographic factors contributing ~70% and surface properties ~25% to predictions. Remaining challenges include dependence on continuous high-quality multi-source inputs and scale effects from mixed pixels. Future work will enhance operational deployment through automated daily preprocessing, improved sub-diurnal characterization via multi-scale data fusion, and stronger physical constraints to increase reliability. Full article
(This article belongs to the Section Atmospheric Remote Sensing)

21 pages, 1819 KB  
Article
MobileNetV3–Transformer-Based Prediction of Highway Accident Severity
by Liang Chen, Jia Wei, Guoqing Wang, Xiaoxiao Yang and Lusheng Qin
Appl. Sci. 2025, 15(23), 12694; https://doi.org/10.3390/app152312694 - 30 Nov 2025
Abstract
Traffic accidents on highways are often characterized by high destructiveness and severe casualties. Predicting accident severity and understanding its causes are crucial for enhancing highway safety. To address the limited prediction accuracy and poor interpretability of traditional machine learning and deep learning methods, this study proposes an accident severity prediction model based on a hybrid architecture of MobileNetV3 and a Transformer. The model first encodes numerical accident-related variables into two-dimensional images using the Gramian Angular Field (GAF) method. Local spatial features are then extracted via the depthwise separable convolution modules of MobileNetV3, and long-range temporal dependencies are captured through the Transformer encoder, which outputs the final prediction. The proposed model is compared with Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), MobileNetV3, a Transformer, and LSTM–Transformer architectures in terms of prediction performance. Results show that the MobileNetV3–Transformer model achieves the highest accuracy of 0.9549. Finally, the DeepSHAP interpretability algorithm is introduced to reveal the systemic influence and contribution of significant factors to accident severity. The results indicate that vehicle age, special road conditions, speed limits, and lighting conditions are closely related to the severity of highway accidents. This study provides a reliable theoretical basis for early warning of highway accidents and for refining control measures to further enhance highway safety. Full article
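The GAF step mentioned in this abstract maps a 1-D series of accident variables to a 2-D image via a polar-angle encoding. A sketch of the summation variant (GASF); the toy series is illustrative, not from the study's data:

```python
import numpy as np

def gasf(series):
    """Gramian Angular Summation Field: encode a 1-D series as a 2-D image."""
    lo, hi = series.min(), series.max()
    x = 2.0 * (series - lo) / (hi - lo) - 1.0   # rescale values to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))      # represent each value as an angle
    return np.cos(phi[:, None] + phi[None, :])  # pairwise angular sums

img = gasf(np.array([0.0, 1.0, 2.0, 3.0]))
# img is 4x4 and symmetric; its diagonal equals cos(2*phi_i) = 2*x_i**2 - 1
```

The resulting image preserves temporal order along both axes, which is what lets an image backbone such as MobileNetV3 consume tabular sequence data.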
19 pages, 1415 KB  
Article
LFRE-YOLO: Lightweight Edge Computing Algorithm for Detecting External-Damage Objects on Transmission Lines
by Min Liu, Benhui Wu and Ming Chen
Information 2025, 16(12), 1035; https://doi.org/10.3390/info16121035 - 27 Nov 2025
Abstract
Transmission lines in complex outdoor environments often suffer external damage in construction areas, severely affecting the stability of power systems. Traditional manual detection methods suffer from low efficiency and poor real-time performance. In deep learning-based detection methods, standard convolution has a large parameter count and high computational complexity, making it difficult to deploy on edge devices; while lightweight depthwise separable convolution offers low computational cost, it suffers from insufficient feature extraction capability. This limitation stems from its independent processing of each channel's information, making it unable to simultaneously meet the practical requirements for both lightweight design and high detection accuracy in transmission line monitoring applications. To address these problems, this study proposes LFRE-YOLO, a lightweight external damage detection algorithm for transmission lines based on YOLOv10n. First, we design a lightweight feature reuse and enhancement convolution (LFREConv) that overcomes the limitations of traditional depthwise separable convolution through a cascaded dual depthwise convolution structure and a residual connection mechanism, significantly expanding the effective receptive field with minimal parameter increment and compensating, via a feature reuse strategy, for the information loss caused by independent channel processing in depthwise convolution. Second, based on LFREConv, we propose an efficient lightweight feature extraction module (LFREBlock) that achieves cross-channel information interaction enhancement and channel importance modeling.
Additionally, we propose a lightweight feature reuse and enhancement detection head (LFRE-Head) that applies LFREConv to the regression branch, achieving a comprehensively lightweight detection head while maintaining spatial localization accuracy. Finally, we employ layer-adaptive magnitude-based pruning (LAMP) to prune the trained model, further optimizing the network structure through layer-wise adaptive pruning. Experimental results demonstrate significant improvements over the YOLOv10n baseline: mAP50 increased from 92.0% to 94.1% and mAP50-95 improved from 66.2% to 70.2%, while parameters were reduced from 2.27 M to 0.99 M and computational complexity from 6.5 G to 3.1 G, with an inference speed of 86.9 FPS, making the model suitable for resource-constrained edge computing environments. Full article
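The lightweight argument in the abstract rests on the standard parameter-count comparison between full convolution and its depthwise separable factorization. The back-of-the-envelope sketch below is illustrative only (the helper names are ours and the counts ignore bias terms); it does not reproduce LFREConv's actual design:

```python
def conv_params(k, c_in, c_out):
    """Parameter count of a standard k x k convolution layer."""
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    """Depthwise separable: one k x k filter per input channel,
    followed by a 1x1 pointwise convolution mixing the channels."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise

std = conv_params(3, 64, 128)          # 73728
sep = dw_separable_params(3, 64, 128)  # 576 + 8192 = 8768
```

For a 3x3 layer with 64 input and 128 output channels, the separable form needs roughly one eighth of the parameters; the price is that the depthwise stage processes each channel in isolation, which is exactly the information-mixing gap the proposed feature reuse and residual connections aim to close.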
(This article belongs to the Special Issue AI-Based Image Processing and Computer Vision)