Search Results (337)

Search Parameters:
Keywords = SE block

25 pages, 5142 KiB  
Article
Wheat Powdery Mildew Severity Classification Based on an Improved ResNet34 Model
by Meilin Li, Yufeng Guo, Wei Guo, Hongbo Qiao, Lei Shi, Yang Liu, Guang Zheng, Hui Zhang and Qiang Wang
Agriculture 2025, 15(15), 1580; https://doi.org/10.3390/agriculture15151580 - 23 Jul 2025
Abstract
Crop disease identification is a pivotal research area in smart agriculture, forming the foundation for disease mapping and targeted prevention strategies. Among the most prevalent global wheat diseases, powdery mildew—caused by fungal infection—poses a significant threat to crop yield and quality, making early and accurate detection crucial for effective management. In this study, we present QY-SE-MResNet34, a deep learning-based classification model that builds upon ResNet34 to perform multi-class classification of wheat leaf images and assess powdery mildew severity at the single-leaf level. The proposed methodology begins with dataset construction following the GBT 17980.22-2000 national standard for powdery mildew severity grading, resulting in a curated collection of 4248 wheat leaf images at the grain-filling stage across six severity levels. To enhance model performance, we integrated transfer learning with ResNet34, leveraging pretrained weights to improve feature extraction and accelerate convergence. Further refinements included embedding a Squeeze-and-Excitation (SE) block to strengthen feature representation while maintaining computational efficiency. The model architecture was also optimized by modifying the first convolutional layer (conv1)—replacing the original 7 × 7 kernel with a 3 × 3 kernel, adjusting the stride to 1, and setting padding to 1—to better capture fine-grained leaf textures and edge features. Subsequently, the optimal training strategy was determined through hyperparameter tuning experiments, and GrabCut-based background processing along with data augmentation were introduced to enhance model robustness. In addition, interpretability techniques such as channel masking and Grad-CAM were employed to visualize the model’s decision-making process. 
Experimental validation demonstrated that QY-SE-MResNet34 achieved an 89% classification accuracy, outperforming established models such as ResNet50, VGG16, and MobileNetV2 and surpassing the original ResNet34 by 11%. This study delivers a high-performance solution for single-leaf wheat powdery mildew severity assessment, offering practical value for intelligent disease monitoring and early warning systems in precision agriculture. Full article
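The effect of the conv1 change described in this abstract can be checked with the standard convolution output-size formula. The sketch below is illustrative only: it assumes a 224 × 224 input (the abstract does not state the input size) and the usual ResNet34 stem of a 7 × 7 kernel with stride 2 and padding 3; the function name is hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size: floor((size + 2*padding - kernel) / stride) + 1."""
    return (size + 2 * padding - kernel) // stride + 1


# Assumed input resolution; not stated in the abstract.
INPUT_SIZE = 224

# Standard ResNet34 stem: 7x7 kernel, stride 2, padding 3.
original_stem = conv_output_size(INPUT_SIZE, kernel=7, stride=2, padding=3)
# Modified stem from the abstract: 3x3 kernel, stride 1, padding 1.
modified_stem = conv_output_size(INPUT_SIZE, kernel=3, stride=1, padding=1)

print(original_stem)  # 112
print(modified_stem)  # 224
```

Under these assumptions the modified stem keeps the full input resolution instead of halving it, which is consistent with the stated goal of capturing fine-grained textures, at the cost of larger feature maps in the stem.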

15 pages, 4874 KiB  
Article
A Novel 3D Convolutional Neural Network-Based Deep Learning Model for Spatiotemporal Feature Mapping for Video Analysis: Feasibility Study for Gastrointestinal Endoscopic Video Classification
by Mrinal Kanti Dhar, Mou Deb, Poonguzhali Elangovan, Keerthy Gopalakrishnan, Divyanshi Sood, Avneet Kaur, Charmy Parikh, Swetha Rapolu, Gianeshwaree Alias Rachna Panjwani, Rabiah Aslam Ansari, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Scott A. Helgeson, Venkata S. Akshintala and Shivaram P. Arunachalam
J. Imaging 2025, 11(7), 243; https://doi.org/10.3390/jimaging11070243 - 18 Jul 2025
Viewed by 260
Abstract
Accurate analysis of medical videos remains a major challenge in deep learning (DL) due to the need for effective spatiotemporal feature mapping that captures both spatial detail and temporal dynamics. Despite advances in DL, most existing models in medical AI focus on static images, overlooking critical temporal cues present in video data. To bridge this gap, a novel DL-based framework is proposed for spatiotemporal feature extraction from medical video sequences. As a feasibility use case, this study focuses on gastrointestinal (GI) endoscopic video classification. A 3D convolutional neural network (CNN) is developed to classify upper and lower GI endoscopic videos using the hyperKvasir dataset, which contains 314 lower and 60 upper GI videos. To address data imbalance, 60 matched pairs of videos are randomly selected across 20 experimental runs. Videos are resized to 224 × 224, and the 3D CNN captures spatiotemporal information. A 3D version of the parallel spatial and channel squeeze-and-excitation (P-scSE) is implemented, and a new block called the residual with parallel attention (RPA) block is proposed by combining P-scSE3D with a residual block. To reduce computational complexity, a (2 + 1)D convolution is used in place of full 3D convolution. The model achieves an average accuracy of 0.933, precision of 0.932, recall of 0.944, F1-score of 0.935, and AUC of 0.933. It is also observed that the integration of P-scSE3D increased the F1-score by 7%. This preliminary work opens avenues for exploring various GI endoscopic video-based prospective studies. Full article
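The saving from replacing a full 3D convolution with a (2 + 1)D one can be illustrated by counting weights. This is a rough sketch, not the authors' configuration: the channel counts, kernel sizes, and the choice of intermediate channel width `c_mid` are all assumptions made for illustration, and biases are ignored.

```python
def conv3d_params(c_in: int, c_out: int, t: int, d: int) -> int:
    """Weights in a full t x d x d 3D convolution (biases ignored)."""
    return c_in * c_out * t * d * d


def conv2plus1d_params(c_in: int, c_out: int, t: int, d: int, c_mid: int) -> int:
    """Weights in a 1 x d x d spatial conv followed by a t x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * d * d
    temporal = c_mid * c_out * t
    return spatial + temporal


# Hypothetical layer: 64 -> 64 channels, 3x3x3 kernel, c_mid chosen equal to c_out.
full = conv3d_params(64, 64, t=3, d=3)
factored = conv2plus1d_params(64, 64, t=3, d=3, c_mid=64)
print(full, factored)  # 110592 49152
```

With a larger `c_mid` the factored layer can match the 3D parameter count instead of reducing it, so the saving depends on how the intermediate width is chosen.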

20 pages, 5486 KiB  
Article
SE-TransUNet-Based Semantic Segmentation for Water Leakage Detection in Tunnel Secondary Linings Amid Complex Visual Backgrounds
by Renjie Song, Yimin Wu, Li Wan, Shuai Shao and Haiping Wu
Appl. Sci. 2025, 15(14), 7872; https://doi.org/10.3390/app15147872 - 14 Jul 2025
Viewed by 182
Abstract
Traditional manual inspection methods for tunnel lining leakage are subjective and inefficient, while existing models lack sufficient recognition accuracy in complex scenarios. An intelligent leakage identification model adaptable to complex backgrounds is therefore needed. To address these issues, a Vision Transformer (ViT) was integrated into the UNet architecture, forming an SE-TransUNet model by incorporating SE-Block modules at skip connections between the encoder-decoder and the ViT output. Using a hybrid leakage dataset partitioned by k-fold cross-validation, the roles of SE-Block and ViT modules were examined through ablation experiments, and the model’s attention mechanism for leakage features was analyzed via Score-CAM heatmaps. Results indicate: (1) SE-TransUNet achieved mean values of 0.8318 (IoU), 0.8304 (Dice), 0.9394 (Recall), 0.8480 (Precision), 0.9733 (AUC), 0.8562 (MCC), 0.9218 (F1-score), and 6.53 (FPS) on the hybrid dataset, demonstrating robust generalization in scenarios with dent shadows, stain interference, and faint leakage traces. (2) Ablation experiments confirmed both modules’ necessity: The baseline model’s IoU exceeded the variant without the SE module by 4.50% and the variant without both the SE and ViT modules by 7.04%. (3) Score-CAM heatmaps showed the SE module broadened the model’s attention coverage of leakage areas, enhanced feature continuity, and improved anti-interference capability in complex environments. This research may provide a reference for related fields. Full article

18 pages, 70320 KiB  
Article
RIS-UNet: A Multi-Level Hierarchical Framework for Liver Tumor Segmentation in CT Images
by Yuchai Wan, Lili Zhang and Murong Wang
Entropy 2025, 27(7), 735; https://doi.org/10.3390/e27070735 - 9 Jul 2025
Viewed by 341
Abstract
The deep learning-based analysis of liver CT images is expected to provide assistance for clinicians in the diagnostic decision-making process. However, the accuracy of existing methods still falls short of clinical requirements and needs to be further improved. Therefore, in this work, we propose a novel multi-level hierarchical framework for liver tumor segmentation. In the first level, we integrate inter-slice spatial information by a 2.5D network to resolve the accuracy–efficiency trade-off inherent in conventional 2D/3D segmentation strategies for liver tumor segmentation. Then, the second level extracts the inner-slice global and local features for enhancing feature representation. We propose the Res-Inception-SE Block, which combines residual connections, multi-scale Inception modules, and squeeze-excitation attention to capture comprehensive global and local features. Furthermore, we design a hybrid loss function combining Binary Cross Entropy (BCE) and Dice loss to solve the category imbalance problem and accelerate convergence. Extensive experiments on the LiTS17 dataset demonstrate the effectiveness of our method on accuracy, efficiency, and visual results for liver tumor segmentation. Full article
(This article belongs to the Special Issue Cutting-Edge AI in Computational Bioinformatics)
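A hybrid of Binary Cross Entropy and Dice loss, as mentioned in this abstract, can be sketched in a few lines. The version below is a minimal illustration on flat lists of per-pixel probabilities; the equal weighting (`bce_weight=0.5`), the smoothing constant, and all function names are assumptions, since the abstract does not give the exact formulation.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross entropy over per-pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - (2*|P.T| + s) / (|P| + |T| + s)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denominator = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def hybrid_loss(probs, targets, bce_weight=0.5):
    """Weighted sum of BCE and Dice; the 0.5 weight is an assumed default."""
    return (bce_weight * bce_loss(probs, targets)
            + (1.0 - bce_weight) * dice_loss(probs, targets))


targets = [1, 0, 1, 0]
print(hybrid_loss([0.9, 0.1, 0.8, 0.2], targets))  # small loss for a good prediction
print(hybrid_loss([0.1, 0.9, 0.2, 0.8], targets))  # larger loss for a poor one
```

The Dice term depends on overlap rather than per-pixel counts, which is why it is commonly paired with BCE when the foreground class (here, tumor pixels) is rare.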

21 pages, 3406 KiB  
Article
ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification
by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu
Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025
Viewed by 479
Abstract
Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze and Excitation blocks, and Convolution Block Attention Modules (ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The recently developed attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows for the addition or removal of feature classes without the need to retrain the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high defect rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. 
Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional deep learning models, such as the parametric YOLO series, in defect detection, achieving higher accuracy and lower miss rates and highlighting its potential for high-reliability industrial deployment.
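The triplet loss used for embedding learning in this abstract follows the general form max(0, d(a, p) - d(a, n) + margin). The sketch below is a minimal illustration with Euclidean distance on plain lists; the margin value and function names are assumptions, not details taken from the paper.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin); margin is assumed."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


anchor = [0.0, 0.0]
positive = [0.0, 1.0]        # same class, distance 1 from the anchor
easy_negative = [3.0, 4.0]   # different class, distance 5 from the anchor
hard_negative = [0.0, 1.5]   # different class, distance 1.5 from the anchor

print(triplet_loss(anchor, positive, easy_negative))  # 0.0
print(triplet_loss(anchor, positive, hard_negative))  # 0.5
```

Only triplets whose negative lies within the margin contribute to the loss, which is why sample selection matters for training efficiency on small, imbalanced datasets.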

33 pages, 3352 KiB  
Article
Optimization Strategy for Underwater Target Recognition Based on Multi-Domain Feature Fusion and Deep Learning
by Yanyang Lu, Lichao Ding, Ming Chen, Danping Shi, Guohao Xie, Yuxin Zhang, Hongyan Jiang and Zhe Chen
J. Mar. Sci. Eng. 2025, 13(7), 1311; https://doi.org/10.3390/jmse13071311 - 7 Jul 2025
Viewed by 348
Abstract
Underwater sonar target recognition is crucial in fields such as national defense, navigation, and environmental monitoring. However, it faces issues such as the complex characteristics of ship-radiated noise, imbalanced data distribution, non-stationarity, and bottlenecks of existing technologies. This paper proposes the MultiFuseNet-AID network, aiming to address these challenges. The network includes the TriFusion block module, the novel lightweight attention residual network (NLARN), the long- and short-term attention (LSTA) module, and the Mamba module. Through the TriFusion block module, the original, differential, and cumulative signals are processed in parallel, and features such as MFCC, CQT, and Fbank are fused to achieve deep multi-domain feature fusion, thereby enhancing the signal representation ability. The NLARN was optimized based on the ResNet architecture, with the SE attention mechanism embedded. Combined with the long- and short-term attention (LSTA) and the Mamba module, it could capture long-sequence dependencies with an O(N) complexity, completing the optimization of lightweight long sequence modeling. At the same time, with the help of feature fusion, and layer normalization and residual connections of the Mamba module, the adaptability of the model in complex scenarios with imbalanced data and strong noise was enhanced. On the DeepShip and ShipsEar datasets, the recognition rates of this model reached 98.39% and 99.77%, respectively. The number of parameters and the number of floating point operations were significantly lower than those of classical models, and it showed good stability and generalization ability under different sample label ratios. The research shows that the MultiFuseNet-AID network effectively broke through the bottlenecks of existing technologies. However, there is still room for improvement in terms of adaptability to extreme underwater environments, training efficiency, and adaptability to ultra-small devices. 
It provides a new direction for the development of underwater sonar target recognition technology. Full article
(This article belongs to the Section Ocean Engineering)

20 pages, 1935 KiB  
Article
Residual Attention Network with Atrous Spatial Pyramid Pooling for Soil Element Estimation in LUCAS Hyperspectral Data
by Yun Deng, Yuchen Cao, Shouxue Chen and Xiaohui Cheng
Appl. Sci. 2025, 15(13), 7457; https://doi.org/10.3390/app15137457 - 3 Jul 2025
Viewed by 245
Abstract
Visible and near-infrared (Vis–NIR) spectroscopy enables the rapid prediction of soil properties but faces three limitations with conventional machine learning: information loss and overfitting from high-dimensional spectral features; inadequate modeling of nonlinear soil–spectra relationships; and failure to integrate multi-scale spatial features. To address these challenges, we propose ReSE-AP Net, a multi-scale attention residual network with spatial pyramid pooling. Built on convolutional residual blocks, the model incorporates a squeeze-and-excitation channel attention mechanism to recalibrate feature weights and an atrous spatial pyramid pooling (ASPP) module to extract multi-resolution spectral features. This architecture synergistically represents weak absorption peaks (400–1000 nm) and broad spectral bands (1000–2500 nm), overcoming single-scale modeling limitations. Validation on the LUCAS2009 dataset demonstrated that ReSE-AP Net outperformed conventional machine learning by improving the R2 by 2.8–36.5% and reducing the RMSE by 14.2–69.2%. Compared with existing deep learning methods, it increased the R2 by 0.4–25.5% for clay, silt, sand, organic carbon, calcium carbonate, and phosphorus predictions, and decreased the RMSE by 0.7–39.0%. Our contributions include statistical analysis of LUCAS2009 spectra, identification of conventional method limitations, development of the ReSE-AP Net model, ablation studies, and comprehensive comparisons with alternative approaches. Full article

20 pages, 760 KiB  
Article
Detecting AI-Generated Images Using a Hybrid ResNet-SE Attention Model
by Abhilash Reddy Gunukula, Himel Das Gupta and Victor S. Sheng
Appl. Sci. 2025, 15(13), 7421; https://doi.org/10.3390/app15137421 - 2 Jul 2025
Viewed by 314
Abstract
The rapid advancements in generative artificial intelligence (AI), particularly through models like Generative Adversarial Networks (GANs) and diffusion-based architectures, have made it increasingly difficult to distinguish between real and synthetically generated images. While these technologies offer benefits in creative domains, they also pose serious risks in terms of misinformation, digital forgery, and identity manipulation. This paper presents a novel hybrid deep learning model for detecting AI-generated images by integrating the ResNet-50 architecture with Squeeze-and-Excitation (SE) attention blocks. The proposed SE-ResNet50 model enhances channel-wise feature recalibration and interpretability by integrating Squeeze-and-Excitation (SE) blocks into the ResNet-50 backbone, enabling dynamic emphasis on subtle generative artifacts such as unnatural textures and semantic inconsistencies, thereby improving classification fidelity. Experimental evaluation on the CIFAKE dataset demonstrates the model’s effectiveness, achieving a test accuracy of 96.12%, precision of 97.04%, recall of 88.94%, F1-score of 92.82%, and an AUC score of 0.9862. The model shows strong generalization, minimal overfitting, and superior performance compared with transformer-based models and standard architectures like ResNet-50, VGGNet, and DenseNet. These results confirm the hybrid model’s suitability for real-time and resource-constrained applications in media forensics, content authentication, and ethical AI governance. Full article
(This article belongs to the Special Issue Advanced Signal and Image Processing for Applied Engineering)
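The channel-wise recalibration performed by an SE block, the search keyword shared by these results, has three steps: squeeze (global average pooling per channel), excitation (two small fully connected layers ending in a sigmoid), and scale (multiply each channel by its gate). The sketch below shows these steps on nested Python lists. It is a toy illustration: the weights are hand-picked, the reduction ratio is arbitrary, and the function name is hypothetical; a real model would learn the weights and use a tensor library.

```python
import math


def se_block(feature_map, w1, b1, w2, b2):
    """Apply squeeze-and-excitation to a feature map shaped [channels][height][width].

    w1/b1: reduction layer (C -> C/r), w2/b2: expansion layer (C/r -> C).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
                for ch in feature_map]
    # Excitation: FC + ReLU, then FC + sigmoid, producing gates in (0, 1).
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)) + b)
              for row, b in zip(w1, b1)]
    gates = [1.0 / (1.0 + math.exp(-(sum(w * h for w, h in zip(row, hidden)) + b)))
             for row, b in zip(w2, b2)]
    # Scale: reweight every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch]
              for ch, g in zip(feature_map, gates)]
    return scaled, gates


# Toy input: 2 channels of 2x2, reduced to 1 hidden unit (hand-picked weights).
features = [[[1.0, 1.0], [1.0, 1.0]],
            [[3.0, 3.0], [3.0, 3.0]]]
scaled, gates = se_block(features,
                         w1=[[1.0, 1.0]], b1=[0.0],
                         w2=[[1.0], [-1.0]], b2=[0.0, 0.0])
print(gates)  # first channel emphasized, second suppressed
```

The block adds only two small fully connected layers per stage, which is why these abstracts describe it as a low-cost way to strengthen feature representation.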

15 pages, 1949 KiB  
Article
High-Performance and Lightweight AI Model with Integrated Self-Attention Layers for Soybean Pod Number Estimation
by Qian Huang
AI 2025, 6(7), 135; https://doi.org/10.3390/ai6070135 - 24 Jun 2025
Viewed by 435
Abstract
Background: Soybean is an important global crop in food security and agricultural economics. Accurate estimation of soybean pod counts is critical for yield prediction, breeding programs, precision farming, etc. Traditional methods, such as manual counting, are slow, labor-intensive, and prone to errors. With rapid advancements in artificial intelligence (AI), deep learning has enabled automatic pod number estimation in collaboration with unmanned aerial vehicles (UAVs). However, existing AI models are computationally demanding and require significant processing resources (e.g., memory). These resources are often not available in rural regions and small farms. Methods: To address these challenges, this study presents a set of lightweight, efficient AI models designed to overcome these limitations. By integrating model simplification, weight quantization, and squeeze-and-excitation (SE) self-attention blocks, we develop compact AI models capable of fast and accurate soybean pod count estimation. Results and Conclusions: Experimental results show a comparable estimation accuracy of 84–87%, while the AI model size is significantly reduced by a factor of 9–65, thus making them suitable for deployment in edge devices, such as Raspberry Pi. Compared to existing models such as YOLO POD and SoybeanNet, which rely on over 20 million parameters to achieve approximately 84% accuracy, our proposed lightweight models deliver a comparable or even higher accuracy (84.0–86.76%) while using fewer than 2 million parameters. In future work, we plan to expand the dataset by incorporating diverse soybean images to enhance model generalizability. Additionally, we aim to explore more advanced attention mechanisms—such as CBAM or ECA—to further improve feature extraction and model performance. Finally, we aim to implement the complete system in edge devices and conduct real-world testing in soybean fields. Full article

17 pages, 2526 KiB  
Article
The Effect of Selenium on Rice Quality Under Different Nitrogen Levels
by Yuqi Liu, Bingchun Yan, Ya Liu, Yuzhuo Liu, Liqiang Chen, Hongfang Jiang, Yingying Feng, Jiping Gao and Wenzhong Zhang
Agronomy 2025, 15(6), 1437; https://doi.org/10.3390/agronomy15061437 - 12 Jun 2025
Viewed by 551
Abstract
Selenium (Se) is a trace element that is beneficial in enhancing the quality of rice production. However, research on the effects of Se on rice quality under varying nitrogen (N) levels is limited and requires further investigation. This experiment utilized a randomized block design, incorporating an N fertilizer reduction and efficient application mode, with two N levels, CN (225 kg·hm−2) and LN (180 kg·hm−2), and three Se levels, HSe (0.12 kg·hm−2), LSe (0.06 kg·hm−2), and 0Se (0.00 kg·hm−2). The results indicated that the effects of Se on rice processing quality differ under different N levels. Selenium adversely affected the processing quality under the CN level, whereas it demonstrated some improvement at the LN level. Furthermore, Se application increased the Se content in rice by 46.48–141.82% and enhanced the taste value by 14.88–22.73%. It significantly improved the nutritional and cooking qualities of rice and positively influenced its appearance. Although N levels induced variations, their overall impact remained beneficial. Considering various indicators, applying 0.06 kg·hm−2 of Na2SeO3 under the LN level yielded optimal results. This study provides valuable insights into the effects of Se on rice quality under different N levels. It provides a more scientific basis for the application of selenium fertilizer in rice. Full article
(This article belongs to the Section Soil and Plant Nutrition)

23 pages, 4555 KiB  
Article
Prediction of Medium-Thick Plates Weld Penetration States in Cold Metal Transfer Plus Pulse Welding Based on Deep Learning Model
by Yanli Song, Kang Song, Yipeng Peng, Lin Hua, Jue Lu and Xuanguo Wang
Metals 2025, 15(6), 637; https://doi.org/10.3390/met15060637 - 5 Jun 2025
Viewed by 442
Abstract
During the cold metal transfer plus pulse (CMT+P) welding process of medium-thick plates, problems such as incomplete penetration (IP) and burn-through (BT) are prone to occur, and weld pool morphology is important information reflecting the penetration states. To acquire high-quality weld pool images under complex welding conditions, such as smoke and arc light, a welding monitoring system was designed. To predict weld penetration states, an improved Inception-ResNet prediction model was proposed. A Squeeze-and-Excitation (SE) block was added after each Inception-ResNet block to further extract key feature information from weld pool images, increasing the weight of key features beneficial for predicting the penetration states. The model was trained, validated, and tested. The results demonstrate that the improved model predicts the penetration states of aluminum alloy medium-thick plates with an accuracy of over 96%, outperforming the original model. The model was applied in welding experiments and achieved accurate predictions.

21 pages, 8812 KiB  
Article
A Three-Channel Improved SE Attention Mechanism Network Based on SVD for High-Order Signal Modulation Recognition
by Xujia Zhou, Gangyi Tu, Xicheng Zhu, Di Zhao and Luyan Zhang
Electronics 2025, 14(11), 2233; https://doi.org/10.3390/electronics14112233 - 30 May 2025
Viewed by 376
Abstract
To address the issues of poor differentiation capability for high-order signals and low average recognition rates in existing communication modulation recognition techniques, this paper first performs denoising using an entropy-based dynamic Singular Value Decomposition (SVD) method and then proposes a three-channel convolutional gated recurrent unit (GRU) model combined with an improved SE attention mechanism for automatic modulation recognition. The model denoises in-phase/quadrature (I/Q) signals using the SVD method to enhance signal quality. By combining one-dimensional (1D) and two-dimensional (2D) convolutions, it employs a three-channel approach to extract spatial features and capture local correlations. A GRU is utilized to capture temporal sequence features so as to enhance the perception of dynamic changes. Additionally, an improved SE block is introduced to optimize feature representation, adaptively adjust channel weights, and improve classification performance. Experiments on the RadioML2016.10a dataset show that the model has a maximum classification recognition rate of 92.54%. Compared with traditional CNN, ResNet, CLDNN, GRU2, DAE, and LSTM2 models, the average recognition accuracy is improved by 5.41% to 8.93%. At the same time, the model significantly enhances the differentiation capability between 16QAM and 64QAM, reducing the average confusion probability by 27.70% to 39.40%.

14 pages, 1196 KiB  
Article
Deep Learning Architectures for Single-Label and Multi-Label Surgical Tool Classification in Minimally Invasive Surgeries
by Hisham ElMoaqet, Hamzeh Qaddoura, Mutaz Ryalat, Natheer Almtireen, Tamer Abdulbaki Alshirbaji, Nour Aldeen Jalal, Thomas Neumuth and Knut Moeller
Appl. Sci. 2025, 15(11), 6121; https://doi.org/10.3390/app15116121 - 29 May 2025
Viewed by 400
Abstract
The integration of Context-Aware Systems (CASs) in Future Operating Rooms (FORs) aims to enhance surgical workflows and outcomes through real-time data analysis. CASs require accurate classification of surgical tools, enabling the understanding of surgical actions. This study proposes a novel deep learning approach for surgical tool classification based on combining convolutional neural networks (CNNs), Feature Fusion Modules (FFMs), Squeeze-and-Excitation (SE) networks, and Bidirectional long-short term memory (BiLSTM) networks to capture both spatial and temporal features in laparoscopic surgical videos. We explored different modeling scenarios with respect to the location and number of SE blocks for multi-label surgical tool classification in the Cholec80 dataset. Furthermore, we analyzed a single-label surgical tool classification model using a simplified and computationally less expensive architecture compared to the multi-label problem setting. The single-label classification model showed an improved overall performance compared to the proposed multi-label classification model due to the increased complexity of identifying multiple tools simultaneously. Nonetheless, our results demonstrated that the proposed CNN-SE-FFM-BiLSTM multi-label model achieved competitive performance to state-of-the-art methods with excellent performance in detecting tools with complex usage patterns and in minority classes. Future work should focus on optimizing models for real-time applications, and broadening dataset evaluations to improve performance in diverse surgical environments. These improvements are crucial for the practical implementation of such models in CASs, ultimately aiming to enhance surgical workflows and patient outcomes in FORs. Full article

23 pages, 5084 KiB  
Article
A Hybrid Dropout Method for High-Precision Seafloor Topography Reconstruction and Uncertainty Quantification
by Xinye Cui, Houpu Li, Yanting Yu, Shaofeng Bian and Guojun Zhai
Appl. Sci. 2025, 15(11), 6113; https://doi.org/10.3390/app15116113 - 29 May 2025
Viewed by 321
Abstract
Seafloor topography super-resolution reconstruction is critical for marine resource exploration, geological monitoring, and navigation safety. However, sparse acoustic data frequently result in the loss of high-frequency details, and traditional deep learning models exhibit limitations in uncertainty quantification, impeding their practical application. To address these challenges, this study systematically investigates the combined effects of various regularization strategies and uncertainty quantification modules. It proposes a hybrid dropout model that jointly optimizes high-precision reconstruction and uncertainty estimation. The model integrates residual blocks, squeeze-and-excitation (SE) modules, and a multi-scale feature extraction network while employing Monte Carlo Dropout (MC-Dropout) alongside heteroscedastic noise modeling to dynamically gate the uncertainty quantification process. By adaptively modulating the regularization strength based on feature activations, the model preserves high-frequency information and accurately estimates predictive uncertainty. The experimental results demonstrate significant improvements in the Root Mean Square Error (RMSE), Mean Absolute Error (MAE), and Peak Signal-to-Noise Ratio (PSNR). Compared to conventional dropout architectures, the proposed method achieves a PSNR increase of 46.5% to 60.5% in test regions with a marked reduction in artifacts. Overall, the synergistic effect of the employed regularization strategies and uncertainty quantification modules substantially enhances detail recovery and robustness in complex seafloor topography reconstruction, offering valuable theoretical insights and practical guidance for further optimization of deep learning models in challenging applications.
(This article belongs to the Section Marine Science and Engineering)
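As a rough illustration of how MC-Dropout and heteroscedastic noise modeling are commonly combined, the sketch below runs several stochastic forward passes and separates the spread across passes (epistemic variance) from the predicted noise (aleatoric variance). It shows the standard decomposition only; it does not reproduce the paper's hybrid gating or adaptive regularization, and `toy_forward` is a hypothetical stand-in for a real network.

```python
import math
import random

def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def toy_forward(features, rate, rng):
    """Hypothetical two-headed model: returns (predicted depth, log-variance)."""
    h = dropout(features, rate, rng)
    mean = sum(h) / len(h)              # stand-in for the regression head
    log_var = -2.0 + 0.1 * abs(mean)    # stand-in for the noise head
    return mean, log_var

def mc_dropout_predict(features, rate=0.2, passes=200, seed=0):
    """Keep dropout active at inference and aggregate over `passes` runs."""
    rng = random.Random(seed)
    outputs = [toy_forward(features, rate, rng) for _ in range(passes)]
    means = [m for m, _ in outputs]
    pred = sum(means) / passes
    # Epistemic part: variance of the predicted means across passes.
    epistemic = sum((m - pred) ** 2 for m in means) / passes
    # Aleatoric part: average of the predicted noise variances.
    aleatoric = sum(math.exp(lv) for _, lv in outputs) / passes
    return pred, epistemic, aleatoric

pred, epistemic, aleatoric = mc_dropout_predict([1.0, 2.0, 3.0, 4.0])
total_variance = epistemic + aleatoric
```

With `rate=0.0` every pass is identical, so the epistemic term collapses to zero and only the predicted noise remains.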

15 pages, 9455 KiB  
Article
Substation Equipment Defect Detection Based on Improved YOLOv8
by Yiwei Sun, Xiangran Sun, Ying Lin, Yi Yang, Zhuangzhuang Li, Lun Du and Chaojun Shi
Sensors 2025, 25(11), 3410; https://doi.org/10.3390/s25113410 - 28 May 2025
Viewed by 498
Abstract
The detection of equipment defects in substations is crucial for maintaining the normal operation of power systems. This paper proposes an object detection algorithm for substation equipment defect detection based on improvements to the YOLOv8 model. First, the backbone of YOLOv8 is replaced with EfficientViT, which not only reduces computational redundancy but also enhances the model’s feature extraction capabilities, thereby improving overall performance. Second, a Squeeze-and-Excitation (SE) attention module is added at the final stage of the backbone network to reinforce the channel-wise feature representation of its input feature maps. Finally, the Bottleneck component within YOLOv8’s C2f module is replaced with FasterBlock, which significantly accelerates inference while maintaining model accuracy. Experimental results on the substation equipment defect dataset demonstrate that the improved algorithm achieves a mean average precision (mAP) of 92.8%, representing a 1.8% enhancement over the baseline model. The substantial improvement in average precision confirms the feasibility and effectiveness of the proposed modifications to the YOLOv8 architecture.
(This article belongs to the Special Issue Diagnosis and Risk Analysis of Electrical Systems)
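For readers unfamiliar with the SE module that these abstracts share, a minimal pure-Python sketch of a generic squeeze-and-excitation block follows. It assumes the usual form (global average pooling, a two-layer bottleneck with ReLU and sigmoid, then channel-wise rescaling); the weights, shapes, and the name `se_block` are hypothetical and do not come from any of the listed papers.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(weights, bias, inputs):
    """Fully connected layer: one output per weight row."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, bias)]

def se_block(feature_maps, w1, b1, w2, b2):
    """feature_maps is a list of C channels, each an H x W list of lists."""
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feature_maps]
    # Excitation: bottleneck (C -> C/r -> C) with ReLU, then sigmoid gates.
    hidden = [max(0.0, v) for v in dense(w1, b1, squeezed)]
    gates = [sigmoid(v) for v in dense(w2, b2, hidden)]
    # Scale: multiply every activation in a channel by that channel's gate.
    scaled = [[[g * v for v in row] for row in ch]
              for ch, g in zip(feature_maps, gates)]
    return scaled, gates

# Hypothetical input: 4 channels of 2 x 2 activations, reduction ratio r = 2.
features = [[[float(c + 1)] * 2 for _ in range(2)] for c in range(4)]
w1 = [[0.5, -0.5, 0.25, 0.0], [0.0, 0.1, -0.2, 0.3]]     # 4 -> 2
b1 = [0.0, 0.0]
w2 = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5], [0.5, -1.0]]  # 2 -> 4
b2 = [0.0, 0.0, 0.0, 0.0]
scaled, gates = se_block(features, w1, b1, w2, b2)
```

The output keeps the input shape, so a block like this can be dropped after an existing stage without changing the layers that follow; only the relative weighting of channels changes.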
