Search Results (242)

Search Parameters:
Keywords = vanishing gradients

24 pages, 6041 KiB  
Article
Attention-Guided Residual Spatiotemporal Network with Label Regularization for Fault Diagnosis with Small Samples
by Yanlong Xu, Liming Zhang, Ling Chen, Tian Tan, Xiaolong Wang and Hongguang Xiao
Sensors 2025, 25(15), 4772; https://doi.org/10.3390/s25154772 - 3 Aug 2025
Viewed by 221
Abstract
Fault diagnosis is of great significance for the maintenance of rotating machinery. Deep learning is an intelligent diagnostic technique that is receiving increasing attention. To address the issues of industrial data with small samples and varying working conditions, a residual convolutional neural network based on the attention mechanism is put forward for the fault diagnosis of rotating machinery. The method incorporates channel attention and spatial attention simultaneously, implementing channel-wise recalibration for frequency-dependent feature adjustment and performing spatial context aggregation across receptive fields. Subsequently, a residual module is introduced to address the vanishing gradient problem of the model in deep network structures. In addition, LSTM is used to realize spatiotemporal feature fusion. Finally, label smoothing regularization (LSR) is proposed to balance the distributional disparities among labeled samples. The effectiveness of the method is evaluated by its application to the vibration signal data from the safe injection pump and the Case Western Reserve University (CWRU). The results show that the method has superb diagnostic accuracy and strong robustness. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)
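The label smoothing regularization mentioned in this abstract can be illustrated with a minimal sketch. It assumes the common uniform form, in which each one-hot target is mixed with a uniform distribution over classes; the abstract does not give the authors' exact formulation, and the names `smooth_labels`, `cross_entropy`, and the value `eps=0.1` are hypothetical.

```python
import numpy as np

def smooth_labels(labels, num_classes, eps=0.1):
    """Return smoothed targets: (1 - eps) * one_hot + eps / num_classes."""
    one_hot = np.eye(num_classes)[np.asarray(labels)]
    return (1.0 - eps) * one_hot + eps / num_classes

def cross_entropy(probs, targets):
    """Mean cross-entropy between predicted probabilities and soft targets."""
    return float(-np.mean(np.sum(targets * np.log(probs + 1e-12), axis=1)))

# Two samples with true classes 0 and 2, out of 4 fault classes (illustrative).
targets = smooth_labels([0, 2], num_classes=4, eps=0.1)
uniform_probs = np.full((2, 4), 0.25)
loss = cross_entropy(uniform_probs, targets)
```

Each smoothed row still sums to 1, so it remains a valid target distribution; the true class keeps most of the mass while the other classes share the rest.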

25 pages, 2515 KiB  
Article
Solar Agro Savior: Smart Agricultural Monitoring Using Drones and Deep Learning Techniques
by Manu Mundappat Ramachandran, Bisni Fahad Mon, Mohammad Hayajneh, Najah Abu Ali and Elarbi Badidi
Agriculture 2025, 15(15), 1656; https://doi.org/10.3390/agriculture15151656 - 1 Aug 2025
Viewed by 295
Abstract
The Solar Agro Savior (SAS) is an innovative solution that is assisted by drones for the sustainable utilization of water and plant disease observation in the agriculture sector. This system integrates an alerting mechanism for humidity, moisture, and temperature variations, which affect the plants’ health and optimization in water utilization, which enhances plant yield productivity. A significant feature of the system is the efficient monitoring system in a larger region through drones’ high-resolution cameras, which enables real-time, efficient response and alerting for environmental fluctuations to the authorities. The machine learning algorithm, particularly recurrent neural networks, which is a pioneer with agriculture and pest control, is incorporated for intelligent monitoring systems. The proposed system incorporates a specialized form of a recurrent neural network, Long Short-Term Memory (LSTM), which effectively addresses the vanishing gradient problem. It also utilizes an attention-based mechanism that enables the model to assign meaningful weights to the most important parts of the data sequence. This algorithm not only enhances water utilization efficiency but also boosts plant yield and strengthens pest control mechanisms. This system also provides sustainability through the re-utilization of water and the elimination of electric energy through solar panel systems for powering the inbuilt irrigation system. A comparative analysis of variant algorithms in the agriculture sector with a machine learning approach was also illustrated, and the proposed system yielded 99% yield accuracy, a 97.8% precision value, 98.4% recall, and a 98.4% F1 score value. By encompassing solar irrigation and artificial intelligence-driven analysis, the proposed algorithm, Solar Agro Savior, established a sustainable framework in the latest agricultural sectors and promoted sustainability to protect our environment and community. Full article
(This article belongs to the Section Agricultural Technology)

29 pages, 36251 KiB  
Article
CCDR: Combining Channel-Wise Convolutional Local Perception, Detachable Self-Attention, and a Residual Feedforward Network for PolSAR Image Classification
by Jianlong Wang, Bingjie Zhang, Zhaozhao Xu, Haifeng Sima and Junding Sun
Remote Sens. 2025, 17(15), 2620; https://doi.org/10.3390/rs17152620 - 28 Jul 2025
Viewed by 232
Abstract
In the task of PolSAR image classification, effectively utilizing convolutional neural networks and vision transformer models with limited labeled data poses a critical challenge. This article proposes a novel method for PolSAR image classification that combines channel-wise convolutional local perception, detachable self-attention, and a residual feedforward network. Specifically, the proposed method comprises several key modules. In the channel-wise convolutional local perception module, channel-wise convolution operations enable accurate extraction of local features from different channels of PolSAR images. The local residual connections further enhance these extracted features, providing more discriminative information for subsequent processing. Additionally, the detachable self-attention mechanism plays a pivotal role: it facilitates effective interaction between local and global information, enabling the model to comprehensively perceive features across different scales, thereby improving classification accuracy and robustness. Subsequently, replacing the conventional feedforward network with a residual feedforward network that incorporates residual structures aids the model in better representing local features, further enhances the capability of cross-layer gradient propagation, and effectively alleviates the problem of vanishing gradients during the training of deep networks. In the final classification stage, two fully connected layers with dropout prevent overfitting, while softmax generates predictions. The proposed method was validated on the AIRSAR Flevoland, RADARSAT-2 San Francisco, and RADARSAT-2 Xi’an datasets. The experimental results demonstrate that the proposed method can attain a high level of classification performance even with a limited amount of labeled data, and the model is relatively stable. Furthermore, the proposed method has lower computational costs than comparative methods. Full article
(This article belongs to the Section Remote Sensing Image Processing)

21 pages, 4388 KiB  
Article
An Omni-Dimensional Dynamic Convolutional Network for Single-Image Super-Resolution Tasks
by Xi Chen, Ziang Wu, Weiping Zhang, Tingting Bi and Chunwei Tian
Mathematics 2025, 13(15), 2388; https://doi.org/10.3390/math13152388 - 25 Jul 2025
Viewed by 286
Abstract
The goal of single-image super-resolution (SISR) tasks is to generate high-definition images from low-quality inputs, with practical uses spanning healthcare diagnostics, aerial imaging, and surveillance systems. Although CNNs have considerably improved image reconstruction quality, existing methods still face limitations, including inadequate restoration of high-frequency details, high computational complexity, and insufficient adaptability to complex scenes. To address these challenges, we propose an Omni-dimensional Dynamic Convolutional Network (ODConvNet) tailored for SISR tasks. Specifically, ODConvNet comprises four key components: a Feature Extraction Block (FEB) that captures low-level spatial features; an Omni-dimensional Dynamic Convolution Block (DCB), which utilizes a multidimensional attention mechanism to dynamically reweight convolution kernels across spatial, channel, and kernel dimensions, thereby enhancing feature expressiveness and context modeling; a Deep Feature Extraction Block (DFEB) that stacks multiple convolutional layers with residual connections to progressively extract and fuse high-level features; and a Reconstruction Block (RB) that employs subpixel convolution to upscale features and refine the final HR output. This mechanism significantly enhances feature extraction and effectively captures rich contextual information. Additionally, we employ an improved residual network structure combined with a refined Charbonnier loss function to alleviate gradient vanishing and exploding to enhance the robustness of model training. Extensive experiments conducted on widely used benchmark datasets, including DIV2K, Set5, Set14, B100, and Urban100, demonstrate that, compared with existing deep learning-based SR methods, our ODConvNet method improves Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM), and the visual quality of SR images is also improved.
Ablation studies further validate the effectiveness and contribution of each component in our network. The proposed ODConvNet offers an effective, flexible, and efficient solution for the SISR task and provides promising directions for future research. Full article

24 pages, 3714 KiB  
Article
DTCMMA: Efficient Wind-Power Forecasting Based on Dimensional Transformation Combined with Multidimensional and Multiscale Convolutional Attention Mechanism
by Wenhan Song, Enguang Zuo, Junyu Zhu, Chen Chen, Cheng Chen, Ziwei Yan and Xiaoyi Lv
Sensors 2025, 25(15), 4530; https://doi.org/10.3390/s25154530 - 22 Jul 2025
Viewed by 275
Abstract
With the growing global demand for clean energy, the accuracy of wind-power forecasting plays a vital role in ensuring the stable operation of power systems. However, wind-power generation is significantly influenced by meteorological conditions and is characterized by high uncertainty and multiscale fluctuations. Traditional recurrent neural network (RNN) and long short-term memory (LSTM) models, although capable of handling sequential data, struggle with modeling long-term temporal dependencies due to the vanishing gradient problem; thus, they are now rarely used. Recently, Transformer models have made notable progress in sequence modeling compared to RNNs and LSTM models. Nevertheless, when dealing with long wind-power sequences, their quadratic computational complexity (O(L^2)) leads to low efficiency, and their global attention mechanism often fails to capture local periodic features accurately, tending to overemphasize redundant information while overlooking key temporal patterns. To address these challenges, this paper proposes a wind-power forecasting method based on dimension-transformed collaborative multidimensional multiscale attention (DTCMMA). This method first employs fast Fourier transform (FFT) to automatically identify the main periodic components in wind-power data, reconstructing the one-dimensional time series as a two-dimensional spatiotemporal representation, thereby explicitly encoding periodic features. Based on this, a collaborative multidimensional multiscale attention (CMMA) mechanism is designed, which hierarchically integrates channel, spatial, and pixel attention to adaptively capture complex spatiotemporal dependencies. Considering the geometric characteristics of the reconstructed data, asymmetric convolution kernels are adopted to enhance feature extraction efficiency.
Experiments on multiple wind-farm datasets and energy-related datasets demonstrate that DTCMMA outperforms mainstream methods such as Transformer, iTransformer, and TimeMixer in long-sequence forecasting tasks, achieving improvements in MSE performance by 34.22%, 2.57%, and 0.51%, respectively. The model’s training speed also surpasses that of the fastest baseline by 300%, significantly improving both prediction accuracy and computational efficiency. This provides an efficient and accurate solution for wind-power forecasting and contributes to the further development and application of wind energy in the global energy mix. Full article
(This article belongs to the Section Intelligent Sensors)
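The dimensional transformation described in this abstract (find the main period with an FFT, then fold the 1D series into a 2D array) can be sketched roughly as follows. This is only an illustration under simple assumptions: a single dominant period is taken from the largest non-DC FFT peak, and the helper names `dominant_period` and `fold_to_2d` and the synthetic 24-step signal are hypothetical, not taken from the paper.

```python
import numpy as np

def dominant_period(series):
    """Estimate the main period (in samples) from the largest non-DC FFT peak."""
    x = np.asarray(series, dtype=float)
    spectrum = np.abs(np.fft.rfft(x - x.mean()))
    spectrum[0] = 0.0  # ignore the DC component
    k = int(np.argmax(spectrum))  # index of the dominant frequency
    if k == 0:
        raise ValueError("series has no periodic component")
    return len(x) // k

def fold_to_2d(series, period):
    """Reshape a 1D series into (num_cycles, period), dropping any remainder."""
    x = np.asarray(series, dtype=float)
    n_cycles = len(x) // period
    return x[: n_cycles * period].reshape(n_cycles, period)

t = np.arange(240)
signal = np.sin(2 * np.pi * t / 24)  # synthetic series with a 24-step cycle
period = dominant_period(signal)
grid = fold_to_2d(signal, period)
```

After folding, each row holds one cycle and each column holds the same phase across cycles, which is what lets 2D convolutions see both within-period and across-period structure.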

21 pages, 1061 KiB  
Article
Local Streamline Pattern and Topological Index of an Isotropic Point in a 2D Velocity Field
by Jian Gao, Rong Wang, Hongping Ma and Wennan Zou
Mathematics 2025, 13(14), 2320; https://doi.org/10.3390/math13142320 - 21 Jul 2025
Viewed by 202
Abstract
In fluid mechanics, most studies on flow structure analysis are simply based on the velocity gradient, which only involves the linear part of the velocity field and does not focus on the isotropic point. In this paper, we are concerned with a general polynomial velocity field with a nonzero linear part and study its streamline pattern around an isotropic point, i.e., the local streamline pattern (LSP). A complete classification of LSPs in two-dimensional (2D) velocity fields is established. By proposing a novel formulation of qualitative equivalence, namely, the invariance under spatiotemporal transformations, we first introduce the quasi-real Schur form to classify the linear part of velocity fields. Then, for a nonlinear velocity field, the topological type of its LSP is either completely determined by the linear part when the determinant of the velocity gradient at the isotropic point is nonzero or controlled by both linear and nonlinear parts when the determinant of the velocity gradient vanishes at the isotropic point. Four new topological types of LSPs through detailed sector analysis are identified. Finally, we propose a direct method for calculating the index of the isotropic point, which also serves as a fundamental topological property of LSPs. These results do challenge the conventional linear analysis paradigm that simply neglects the contribution of the nonlinear part of the velocity field to the streamline pattern. Full article
(This article belongs to the Section E4: Mathematical Physics)

26 pages, 7857 KiB  
Article
Investigation of an Efficient Multi-Class Cotton Leaf Disease Detection Algorithm That Leverages YOLOv11
by Fangyu Hu, Mairheba Abula, Di Wang, Xuan Li, Ning Yan, Qu Xie and Xuedong Zhang
Sensors 2025, 25(14), 4432; https://doi.org/10.3390/s25144432 - 16 Jul 2025
Viewed by 337
Abstract
Cotton leaf diseases can lead to substantial yield losses and economic burdens. Traditional detection methods are challenged by low accuracy and high labor costs. This research presents the ACURS-YOLO network, an advanced cotton leaf disease detection architecture developed on the foundation of YOLOv11. By integrating a medical image segmentation model, it effectively tackles challenges including complex background interference, the missed detection of small targets, and restricted generalization ability. Specifically, the U-Net v2 module is embedded in the backbone network to boost the multi-scale feature extraction performance in YOLOv11. Meanwhile, the CBAM attention mechanism is integrated to emphasize critical disease-related features. To lower the computational complexity, the SPPF module is substituted with SimSPPF. The C3k2_RCM module is appended for long–range context modeling, and the ARelu activation function is employed to alleviate the vanishing gradient problem. A database comprising 3000 images covering six types of cotton leaf diseases was constructed, and data augmentation techniques were applied. The experimental results show that ACURS-YOLO attains impressive performance indicators, encompassing a mAP_0.5 value of 94.6%, a mAP_0.5:0.95 value of 83.4%, 95.5% accuracy, 89.3% recall, an F1 score of 92.3%, and a frame rate of 148 frames per second. It outperforms YOLOv11 and other conventional models with regard to both detection precision and overall functionality. Ablation tests additionally validate the efficacy of each component, affirming the framework’s advantage in addressing complex detection environments. This framework provides an efficient solution for the automated monitoring of cotton leaf diseases, advancing the development of smart sensors through improved detection accuracy and practical applicability. Full article
(This article belongs to the Section Smart Agriculture)

24 pages, 3601 KiB  
Article
Laser-Induced Breakdown Spectroscopy Quantitative Analysis Using a Bayesian Optimization-Based Tunable Softplus Backpropagation Neural Network
by Xuesen Xu, Shijia Luo, Xuchen Zhang, Weiming Xu, Rong Shu, Jianyu Wang, Xiangfeng Liu, Ping Li, Changheng Li and Luning Li
Remote Sens. 2025, 17(14), 2457; https://doi.org/10.3390/rs17142457 - 16 Jul 2025
Viewed by 307
Abstract
Laser-induced breakdown spectroscopy (LIBS) has played a critical role in Mars exploration missions, substantially contributing to the geochemical analysis of Martian surface substances. However, the complex nonlinearity of LIBS processes can considerably limit the quantification accuracy of conventional LIBS chemometric methods. Hence chemometrics based on artificial neural network (ANN) algorithms have become increasingly popular in LIBS analysis due to their extraordinary ability in nonlinear feature modeling. The hidden layer activation functions are key to ANN model performance, yet common activation functions usually suffer from problems such as gradient vanishing (e.g., Sigmoid and Tanh) and dying neurons (e.g., ReLU). In this study, we propose a novel LIBS quantification method, named the Bayesian optimization-based tunable Softplus backpropagation neural network (BOTS-BPNN). Based on a dataset comprising 1800 LIBS spectra collected by a laboratory duplicate of the MarSCoDe instrument onboard the Zhurong Mars rover, we have revealed that a BPNN model adopting a tunable Softplus activation function can achieve higher prediction accuracy than BPNN models adopting other common activation functions if the tunable Softplus parameter β is properly selected. Moreover, the way to find the proper β value has also been investigated. We demonstrate that the Bayesian optimization method surpasses the traditional grid search method regarding both performance and efficiency. The BOTS-BPNN model also shows superior performance over other common machine learning models like random forest (RF). This work indicates the potential of BOTS-BPNN as an effective chemometric method for analyzing Mars in situ LIBS data and sheds light on the use of chemometrics for data analysis in future planetary explorations. Full article
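To make the role of the tunable parameter β concrete, here is a minimal sketch of a Softplus activation with a sharpness parameter. It assumes the common parameterization softplus(x) = (1/β) · log(1 + exp(βx)); the abstract does not state the exact form used in BOTS-BPNN, and the function names are hypothetical.

```python
import numpy as np

def softplus(x, beta=1.0):
    """Softplus with sharpness beta: (1/beta) * log(1 + exp(beta * x)), computed stably."""
    z = beta * np.asarray(x, dtype=float)
    # log(1 + e^z) = max(z, 0) + log1p(e^{-|z|}) avoids overflow for large |z|
    return (np.maximum(z, 0.0) + np.log1p(np.exp(-np.abs(z)))) / beta

def softplus_grad(x, beta=1.0):
    """Derivative of softplus with respect to x, equal to sigmoid(beta * x)."""
    z = beta * np.asarray(x, dtype=float)
    return 0.5 * (1.0 + np.tanh(0.5 * z))  # numerically stable sigmoid

xs = np.array([-3.0, 0.0, 3.0])
values_soft = softplus(xs, beta=1.0)
values_sharp = softplus(xs, beta=50.0)  # close to ReLU for large beta
```

Under this parameterization the gradient is strictly positive for every input, so units cannot go fully dead as they can with ReLU, while a larger β moves the curve toward ReLU. Which β works best is the quantity the abstract says is chosen by Bayesian optimization.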

24 pages, 2467 KiB  
Article
Laor Initialization: A New Weight Initialization Method for the Backpropagation of Deep Learning
by Laor Boongasame, Jirapond Muangprathub and Karanrat Thammarak
Big Data Cogn. Comput. 2025, 9(7), 181; https://doi.org/10.3390/bdcc9070181 - 7 Jul 2025
Viewed by 591
Abstract
This paper presents Laor Initialization, an innovative weight initialization technique for deep neural networks that utilizes forward-pass error feedback in conjunction with k-means clustering to optimize the initial weights. In contrast to traditional methods, Laor adopts a data-driven approach that enhances convergence’s stability and efficiency. The method was assessed using various datasets, including a gold price time series, MNIST, and CIFAR-10 across the CNN and LSTM architectures. The results indicate that the Laor Initialization achieved the lowest K-fold cross-validation RMSE (0.00686), surpassing Xavier, He, and Random. Laor demonstrated a high convergence success (final RMSE = 0.00822) and the narrowest interquartile range (IQR), indicating superior stability. Gradient analysis confirmed Laor’s robustness, achieving the lowest coefficients of variation (CV = 0.2230 for MNIST, 0.3448 for CIFAR-10, and 0.5997 for gold price) with zero vanishing layers in the CNNs. Laor achieved a 24% reduction in CPU training time for the Gold price data and the fastest runtime on MNIST (340.69 s), while maintaining efficiency on CIFAR-10 (317.30 s). It performed optimally with a batch size of 32 and a learning rate between 0.001 and 0.01. These findings establish Laor as a robust alternative to conventional methods, suitable for moderately deep architectures. Future research should focus on dynamic variance scaling and adaptive clustering. Full article

21 pages, 804 KiB  
Article
Spam Email Detection Using Long Short-Term Memory and Gated Recurrent Unit
by Samiullah Saleem, Zaheer Ul Islam, Syed Shabih Ul Hasan, Habib Akbar, Muhammad Faizan Khan and Syed Adil Ibrar
Appl. Sci. 2025, 15(13), 7407; https://doi.org/10.3390/app15137407 - 1 Jul 2025
Viewed by 538
Abstract
In today’s business environment, emails are essential across all sectors, including finance and academia. There are two main types of emails: ham (legitimate) and spam (unsolicited). Spam wastes consumers’ time and resources and poses risks to sensitive data, with volumes doubling daily. Current spam identification methods, such as Blocklist approaches and content-based techniques, have limitations, highlighting the need for more effective solutions. These constraints call for detailed and more accurate approaches, such as machine learning (ML) and deep learning (DL), for realistic detection of new scams. Emphasis has since been placed on the possibility that ML and DL technologies are present in detecting email spam. In this work, we have succeeded in developing a hybrid deep learning model, where Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU) are applied distinctly to identify spam email. Despite the fact that the other models have been applied independently (CNNs, LSTM, GRU, or ensemble machine learning classifier) in previous studies, the given research has provided a contribution to the existing body of literature since it has managed to combine the advantage of LSTM in capturing the long-term dependency and the effectiveness of GRU in terms of computational efficiency. In this hybridization, we have addressed key issues such as the vanishing gradient problem and outrageous resource consumption that are usually encountered in applying standalone deep learning. Moreover, our proposed model is superior regarding the detection accuracy (90%) and AUC (98.99%). Though Transformer-based models are significantly lighter and can be used in real-time applications, they require extensive computation resources. 
The proposed work presents a substantive and scalable foundation to spam detection that is technically and practically dissimilar to the familiar approaches due to the powerful preprocessing steps, including particular stop-word removal, TF-IDF vectorization, and model testing on large, real-world size dataset (Enron-Spam). Additionally, delays in the feature comparison technique within the model minimize false positives and false negatives. Full article

17 pages, 2768 KiB  
Article
An Accelerated Editing Method for Stress Signal on Combine Harvester Chassis Using Wavelet Transform
by Shengcao Huang, Zihan Yang, Zhenghe Song, Zhiwei Yu, Xiaobo Guo and Du Chen
Sensors 2025, 25(13), 4100; https://doi.org/10.3390/s25134100 - 30 Jun 2025
Viewed by 311
Abstract
This paper presents a load spectrum acceleration editing method based on wavelet transform. The principle of the method is to decompose the target signal using wavelet transform to obtain high-frequency wavelet components, which are classified and combined based on their frequency components for accelerated editing. During the damage segment identification stage, a threshold selection method based on the pseudo-damage gradient of the segment identification results is proposed. An envelope-based damage identification method is used to extract high-damage segments from the original signal, which are then concatenated to form an accelerated signal. Using the stress signal on the chassis of a combine harvester as a case study, the effectiveness of various accelerated editing methods is compared, with a discussion on the selection of wavelet function parameters. The results indicate that, compared to the time-domain damage retention method and the traditional wavelet transform accelerated editing method, the proposed improvement enhances the acceleration effect of the time-domain signal by 7.76% and 15.92%, respectively. The accelerated signal is consistent with the original signal in terms of statistical parameters and power spectral density. Additionally, we also found that an appropriate selection of the wavelet function’s vanishing moment can further reduce the time-domain signal length of the accelerated result by 4.8%. This study can provide beneficial experiential references for load spectrum development in the accelerated durability testing of agricultural machinery. Full article
(This article belongs to the Section Smart Agriculture)

22 pages, 4478 KiB  
Article
Welding Image Data Augmentation Method Based on LRGAN Model
by Ying Wang, Zhe Dai, Qiang Zhang and Zihao Han
Appl. Sci. 2025, 15(12), 6923; https://doi.org/10.3390/app15126923 - 19 Jun 2025
Viewed by 373
Abstract
This study focuses on the data bottleneck issue in the training of deep learning models during the intelligent welding control process and proposes an improved model called LRGAN (loss reconstruction generative adversarial networks). First, a five-layer spectral normalization neural network was designed as the discriminator of the model. By incorporating the least squares loss function, the gradients of the model parameters were constrained within a reasonable range, which not only accelerated the convergence process but also effectively limited drastic changes in model parameters, alleviating the vanishing gradient problem. Next, a nine-layer residual structure was introduced in the generator to optimize the training of deep networks, preventing the mode collapse issue caused by the increase in the number of layers. The final experimental results show that the proposed LRGAN model outperforms other generative models in terms of evaluation metrics such as peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Fréchet inception distance (FID). It provides an effective solution to the small sample problem in the intelligent welding control process. Full article
(This article belongs to the Section Robotics and Automation)

20 pages, 2171 KiB  
Article
CBAM-ResNet: A Lightweight ResNet Network Focusing on Time Domain Features for End-to-End Deepfake Speech Detection
by Yuezhou Wu, Hua Huang, Zhiri Li and Siling Zhang
Electronics 2025, 14(12), 2456; https://doi.org/10.3390/electronics14122456 - 17 Jun 2025
Viewed by 416
Abstract
With the rapid development of synthetic speech and deepfake technology, fake speech poses a severe challenge to voice authentication systems. Traditional detection methods generally rely on manual feature extraction, facing problems such as limited feature expression ability and insufficient cross-scenario generalization performance. To this end, this paper proposes an improved ResNet network based on a Convolutional Block Attention Module (CBAM) for end-to-end fake speech detection. This method introduces channel attention and spatial attention mechanisms into the ResNet network structure to enhance the model’s attention to the temporal characteristics of speech, thereby improving the ability to distinguish between real and fake speech. The proposed model adopts an end-to-end training strategy, directly processes the original spectrogram input, uses the residual structure to alleviate the gradient vanishing problem in the deep network, and enhances the collaborative expression ability of local details and global context through the CBAM module. The experiment is conducted on the ASVspoof2019 LA dataset, and the equal error rate (EER) is used as the main evaluation indicator. The experimental results show that compared with traditional deepfake speech detection methods, the proposed model achieves better performance in indicators such as EER, verifying the effectiveness of the CBAM attention mechanism in forged speech detection. Full article
(This article belongs to the Special Issue Emerging Trends in Generative-AI Based Audio Processing)
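The abstract above names the equal error rate (EER) as its main metric. As a rough illustration only, and not the authors' evaluation code, the sketch below approximates EER by sweeping every observed score as a threshold and picking the point where the false acceptance and false rejection rates are closest. The function name `equal_error_rate` and the convention that a higher score means "more likely bona fide" are assumptions made for this example.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate EER by trying every observed score as a threshold.

    Assumption: a higher score means "more likely bona fide".
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # False rejection rate: bona fide speech scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed speech scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint of the two rates where they are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical scores: one bona fide and one spoofed utterance overlap.
bonafide = [0.9, 0.8, 0.7, 0.4]
spoof = [0.1, 0.2, 0.3, 0.6]
print(equal_error_rate(bonafide, spoof))
```

A lower EER indicates better separation between real and fake speech; with perfectly separated scores this sketch returns 0.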

21 pages, 3621 KiB  
Article
CSNet: A Remote Sensing Image Semantic Segmentation Network Based on Coordinate Attention and Skip Connections
by Jiahao Li, Hongguo Zhang, Liang Chen, Binbin He and Huaixin Chen
Remote Sens. 2025, 17(12), 2048; https://doi.org/10.3390/rs17122048 - 13 Jun 2025
Cited by 1 | Viewed by 517
Abstract
In recent years, the continuous development of deep learning has significantly advanced its application in the field of remote sensing. However, the semantic segmentation of high-resolution remote sensing images remains challenging due to the presence of multi-scale objects and intricate spatial details, often leading to the loss of critical information during segmentation. To address this issue and enable fast and accurate segmentation of remote sensing images, we made improvements based on SegNet and named the enhanced model CSNet. CSNet is built upon the SegNet architecture and incorporates a coordinate attention (CA) mechanism, which enables the network to focus on salient features and capture global spatial information, thereby improving segmentation accuracy and facilitating the recovery of spatial structures. Furthermore, skip connections are introduced between the encoder and decoder to directly transfer low-level features to the decoder. This promotes the fusion of semantic information at different levels, enhances the recovery of fine-grained details, and optimizes the gradient flow during training, effectively mitigating the vanishing gradient problem and improving training efficiency. Additionally, a hybrid loss function combining weighted cross-entropy and Dice loss is employed. To address the issue of class imbalance, several categories within the dataset are merged, and samples with an excessively high proportion of background pixels are removed. These strategies significantly enhance the segmentation performance, particularly for small-sample classes. Experimental results from the Five-Billion-Pixels dataset demonstrate that, while introducing only a modest increase in parameters compared to SegNet, CSNet achieves superior segmentation performance in terms of overall classification accuracy, boundary delineation, and detail preservation, outperforming established methods such as U-Net, FCN, DeepLabv3+, SegNet, ViT, HRNet and BiFormer.
(This article belongs to the Section Remote Sensing Image Processing)
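The abstract above mentions a hybrid loss combining weighted cross-entropy and Dice loss but does not state how the two terms are mixed. The sketch below shows one common way such a loss can be formed, on plain Python lists rather than tensors. The mixing weight `alpha`, the smoothing constant `smooth`, and all function names are assumptions for illustration and are not taken from the paper.

```python
import math


def weighted_cross_entropy(probs, labels, class_weights):
    """Mean weighted cross-entropy; probs[i][c] is the predicted probability of class c at pixel i."""
    eps = 1e-12  # avoids log(0)
    total = sum(-class_weights[y] * math.log(p[y] + eps)
                for p, y in zip(probs, labels))
    return total / len(labels)


def dice_loss(probs, labels, num_classes, smooth=1.0):
    """One minus the mean soft Dice coefficient over classes (smooth is an assumed constant)."""
    dice_sum = 0.0
    for c in range(num_classes):
        intersection = sum(p[c] for p, y in zip(probs, labels) if y == c)
        pred_mass = sum(p[c] for p in probs)
        true_mass = sum(1 for y in labels if y == c)
        dice_sum += (2 * intersection + smooth) / (pred_mass + true_mass + smooth)
    return 1.0 - dice_sum / num_classes


def hybrid_loss(probs, labels, class_weights, alpha=0.5):
    """Assumed mixing rule: alpha * weighted CE + (1 - alpha) * Dice."""
    ce = weighted_cross_entropy(probs, labels, class_weights)
    dl = dice_loss(probs, labels, len(class_weights))
    return alpha * ce + (1 - alpha) * dl


# Two pixels, two classes; the rare class 1 gets a larger (hypothetical) weight.
labels = [0, 1]
weights = [1.0, 3.0]
confident = [[1.0, 0.0], [0.0, 1.0]]
uncertain = [[0.5, 0.5], [0.5, 0.5]]
print(hybrid_loss(confident, labels, weights), hybrid_loss(uncertain, labels, weights))
```

The class weights raise the penalty for mistakes on under-represented classes, while the Dice term rewards overlap per class regardless of how many pixels that class covers.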

26 pages, 2599 KiB  
Article
IGWO-MALSTM: An Improved Grey Wolf-Optimized Hybrid LSTM with Multi-Head Attention for Financial Time Series Forecasting
by Mingfu Zhu, Haoran Qi and Panke Qin
Appl. Sci. 2025, 15(12), 6619; https://doi.org/10.3390/app15126619 - 12 Jun 2025
Viewed by 448
Abstract
In the domain of financial markets, deep learning techniques have emerged as a significant tool for the development of investment strategies. The present study investigates the potential of time series forecasting (TSF) in financial application scenarios, aiming to predict future spreads and inform investment decisions more effectively. However, the inherent nonlinearity and high volatility of financial time series pose significant challenges for accurate forecasting. To address these issues, this paper proposes the IGWO-MALSTM model, a hybrid framework that integrates Improved Grey Wolf Optimization (IGWO) for hyperparameter tuning and a multi-head attention (MA) mechanism to enhance long-term sequence modeling within the long short-term memory (LSTM) architecture. The IGWO algorithm improves population diversity during initialization using the Mersenne Twister, thereby enhancing the convergence speed and search capability of the optimizer. Simultaneously, the MA mechanism mitigates gradient vanishing and explosion problems, enabling the model to better capture long-range dependencies in financial sequences. Experimental results on real futures market data demonstrate that the proposed model reduces Mean Square Error (MSE) by up to 61.45% and Mean Absolute Error (MAE) by 44.53%, and increases the R² score by 0.83% compared to existing benchmark models. These findings confirm that IGWO-MALSTM offers improved predictive accuracy and stability for financial time series forecasting tasks.
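The abstract above reports results in terms of MSE, MAE, and the R² score. For readers who want the definitions in concrete form, the sketch below computes all three from a list of true values and predictions using their standard formulas. The function name `regression_metrics` and the sample numbers are hypothetical and unrelated to the paper's data.

```python
def regression_metrics(y_true, y_pred):
    """Return (MSE, MAE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n          # mean squared error
    mae = sum(abs(e) for e in errors) / n         # mean absolute error
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                    # undefined if y_true is constant
    return mse, mae, r2


# Hypothetical values, for illustration only.
mse, mae, r2 = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(mse, mae, r2)
```

Lower MSE and MAE are better, whereas R² is better when closer to 1, which is why the abstract reports reductions for the first two and an increase for the third.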