Search Results (377)

Search Parameters:
Keywords = squeeze-and-excitation networks

24 pages, 3087 KiB  
Article
Photoplethysmogram (PPG)-Based Biometric Identification Using 2D Signal Transformation and Multi-Scale Feature Fusion
by Yuanyuan Xu, Zhi Wang and Xiaochang Liu
Sensors 2025, 25(15), 4849; https://doi.org/10.3390/s25154849 - 7 Aug 2025
Abstract
Using Photoplethysmogram (PPG) signals for identity recognition has been proven effective in biometric authentication. However, in real-world applications, PPG signals are prone to interference from noise, physical activity, diseases, and other factors, making it challenging to ensure accurate user recognition and verification in complex environments. To address these issues, this paper proposes an improved MSF-SE ResNet50 (Multi-Scale Feature Squeeze-and-Excitation ResNet50) model based on 2D PPG signals. Unlike most existing methods that directly process one-dimensional PPG signals, this paper adopts a novel approach based on two-dimensional PPG signal processing. By applying Continuous Wavelet Transform (CWT), the preprocessed one-dimensional PPG signal is transformed into a two-dimensional time-frequency map, which not only preserves the time-frequency characteristics of the signal but also provides richer spatial information. During the feature extraction process, the SENet module is first introduced to enhance the ability to extract distinctive features. Next, a novel Lightweight Multi-Scale Feature Fusion (LMSFF) module is proposed, which addresses the limitation of single-scale feature extraction in existing methods by employing parallel multi-scale convolutional operations. Finally, cross-stage feature fusion is implemented, overcoming the limitations of traditional feature fusion methods. These techniques work synergistically to improve the model’s performance. On the BIDMC dataset, the MSF-SE ResNet50 model achieved accuracy, precision, recall, and F1 scores of 98.41%, 98.19%, 98.27%, and 98.23%, respectively. Compared to existing state-of-the-art methods, the proposed model demonstrates significant improvements across all evaluation metrics, highlighting its significance in terms of network architecture and performance. Full article
(This article belongs to the Section Biomedical Sensors)
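
The abstract above describes turning a one-dimensional PPG signal into a two-dimensional time-frequency map with a Continuous Wavelet Transform. A minimal sketch of that step follows, assuming a complex Morlet wavelet with centre parameter w0 = 6, a synthetic 2 Hz sine in place of real PPG data, and hand-picked scales; none of these choices come from the paper, and the function name `morlet_cwt` is hypothetical.

```python
import cmath
import math


def morlet_cwt(signal, scales, dt, w0=6.0):
    """Return a len(scales) x len(signal) scalogram of |CWT| magnitudes."""
    n = len(signal)
    scalogram = []
    for s in scales:
        # Conjugated, scaled Morlet wavelet sampled at every possible sample offset.
        kernel = {}
        for offset in range(-(n - 1), n):
            u = offset * dt / s
            kernel[offset] = (math.pi ** -0.25) * cmath.exp(-1j * w0 * u) * math.exp(-0.5 * u * u)
        row = []
        for b in range(n):
            # Discrete approximation of the CWT integral centred on sample b.
            acc = sum(signal[k] * kernel[k - b] for k in range(n))
            row.append(abs(acc) * dt / math.sqrt(s))
        scalogram.append(row)
    return scalogram


# Assumed toy input: 4 s of a 2 Hz sine sampled at 50 Hz (not real PPG data).
fs = 50.0
signal = [math.sin(2 * math.pi * 2.0 * k / fs) for k in range(200)]
scales = [0.12, 0.24, 0.48, 0.96]  # seconds; roughly 8, 4, 2 and 1 Hz for w0 = 6
scalogram = morlet_cwt(signal, scales, dt=1 / fs)
mean_magnitude = [sum(row) / len(row) for row in scalogram]
```

Each row of `scalogram` is one scale and each column one time step, so the nested list can be treated as a single-channel image. For the toy sine, the row for scale 0.48 (about 2 Hz) carries most of the magnitude.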

24 pages, 5022 KiB  
Article
Aging-Invariant Sheep Face Recognition Through Feature Decoupling
by Suhui Liu, Chuanzhong Xuan, Zhaohui Tang, Guangpu Wang, Xinyu Gao and Zhipan Wang
Animals 2025, 15(15), 2299; https://doi.org/10.3390/ani15152299 - 6 Aug 2025
Abstract
Precise recognition of individual ovine specimens plays a pivotal role in implementing smart agricultural platforms and optimizing herd management systems. With the development of deep learning technology, sheep face recognition provides an efficient and contactless solution for individual sheep identification. However, with the growth of sheep, their facial features keep changing, which poses challenges for existing sheep face recognition models to maintain accuracy across the dynamic changes in facial features over time, making it difficult to meet practical needs. To address this limitation, we propose the lifelong biometric learning of the sheep face network (LBL-SheepNet), a feature decoupling network designed for continuous adaptation to ovine facial changes, and constructed a dataset of 31,200 images from 55 sheep tracked monthly from 1 to 12 months of age. The LBL-SheepNet model addresses dynamic variations in facial features during sheep growth through a multi-module architectural framework. Firstly, a Squeeze-and-Excitation (SE) module enhances discriminative feature representation through adaptive channel-wise recalibration. Then, a nonlinear feature decoupling module employs a hybrid channel-batch attention mechanism to separate age-related features from identity-specific characteristics. Finally, a correlation analysis module utilizes adversarial learning to suppress age-biased feature interference, ensuring focus on age-invariant identifiers. Experimental results demonstrate that LBL-SheepNet achieves 95.5% identification accuracy and 95.3% average precision on the sheep face dataset. This study introduces a lifelong biometric learning (LBL) mechanism to mitigate recognition accuracy degradation caused by dynamic facial feature variations in growing sheep. 
By designing a feature decoupling network integrated with adversarial age-invariant learning, the proposed method addresses the performance limitations of existing models in long-term individual identification. Full article
(This article belongs to the Section Animal System and Management)
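
Several abstracts in this listing rely on channel-wise recalibration by a Squeeze-and-Excitation (SE) block. The sketch below shows the generic squeeze, excitation, and scale steps on a tiny feature map stored as nested lists. It is an illustration under stated assumptions, not the implementation of any listed paper: the weights are made up, biases are omitted, and `se_block` is a hypothetical name.

```python
import math


def se_block(feature_map, w1, w2):
    """Recalibrate a C x H x W feature map (nested lists) channel by channel.

    w1 has shape (C/r) x C and w2 has shape C x (C/r); biases are omitted.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: bottleneck layer with ReLU, then a layer with sigmoid gates in (0, 1).
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden)))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for gate, channel in zip(gates, feature_map)
    ]
    return scaled, gates


# Assumed toy input: two 2 x 2 channels and hand-picked weights (reduction to one unit).
features = [[[1.0, 1.0], [1.0, 1.0]], [[3.0, 3.0], [3.0, 3.0]]]
w1 = [[-1.0, 1.0]]
w2 = [[2.0], [-2.0]]
scaled, gates = se_block(features, w1, w2)
```

With these toy weights the first channel is kept almost unchanged and the second is strongly suppressed, which is the "adaptive channel-wise recalibration" the abstract refers to. In a trained network the two weight matrices are learned.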

15 pages, 4422 KiB  
Article
Advanced Deep Learning Methods to Generate and Discriminate Fake Images of Egyptian Monuments
by Daniyah Alaswad and Mohamed A. Zohdy
Appl. Sci. 2025, 15(15), 8670; https://doi.org/10.3390/app15158670 - 5 Aug 2025
Abstract
Artificial intelligence technologies, particularly machine learning and computer vision, are being increasingly utilized to preserve, restore, and create immersive virtual experiences with cultural artifacts and sites, thus aiding in conserving cultural heritage and making it accessible to a global audience. This paper examines the performance of Generative Adversarial Networks (GAN), especially Style-Based Generator Architecture (StyleGAN), as a deep learning approach for producing realistic images of Egyptian monuments. We used Sigmoid loss for Language–Image Pre-training (SigLIP) as a unique image–text alignment system to guide monument generation through semantic elements. We also studied truncation methods to regulate the generated image noise and identify the most effective parameter settings based on architectural representation versus diverse output creation. An improved discriminator design that combined noise addition with squeeze-and-excitation blocks and a modified MinibatchStdLayer produced 27.5% better Fréchet Inception Distance performance than the original discriminator models. Moreover, differential evolution for latent-space optimization reduced alignment mistakes during specific monument construction tasks by about 15%. We checked a wide range of truncation values from 0.1 to 1.0 and found that somewhere between 0.4 and 0.7 was the best range because it allowed for good accuracy while retaining many different architectural elements. Our findings indicate that specific model optimization strategies produce superior outcomes by creating better-quality and historically correct representations of diverse Egyptian monuments. Thus, the developed technology may be instrumental in generating educational and archaeological visualization assets while adding virtual tourism capabilities. Full article
(This article belongs to the Special Issue Novel Applications of Machine Learning and Bayesian Optimization)

24 pages, 29785 KiB  
Article
Multi-Scale Feature Extraction with 3D Complex-Valued Network for PolSAR Image Classification
by Nana Jiang, Wenbo Zhao, Jiao Guo, Qiang Zhao and Jubo Zhu
Remote Sens. 2025, 17(15), 2663; https://doi.org/10.3390/rs17152663 - 1 Aug 2025
Viewed by 233
Abstract
Compared to traditional real-valued neural networks, which process only amplitude information, complex-valued neural networks handle both amplitude and phase information, leading to superior performance in polarimetric synthetic aperture radar (PolSAR) image classification tasks. This paper proposes a multi-scale feature extraction (MSFE) method based on a 3D complex-valued network to improve classification accuracy by fully leveraging multi-scale features, including phase information. We first designed a complex-valued three-dimensional network framework combining complex-valued 3D convolution (CV-3DConv) with complex-valued squeeze-and-excitation (CV-SE) modules. This framework is capable of simultaneously capturing spatial and polarimetric features, including both amplitude and phase information, from PolSAR images. Furthermore, to address robustness degradation from limited labeled samples, we introduced a multi-scale learning strategy that jointly models global and local features. Specifically, global features extract overall semantic information, while local features help the network capture region-specific semantics. This strategy enhances information utilization by integrating multi-scale receptive fields, complementing feature advantages. Extensive experiments on four benchmark datasets demonstrated that the proposed method outperforms various comparison methods, maintaining high classification accuracy across different sampling rates, thus validating its effectiveness and robustness. Full article

26 pages, 1790 KiB  
Article
A Hybrid Deep Learning Model for Aromatic and Medicinal Plant Species Classification Using a Curated Leaf Image Dataset
by Shareena E. M., D. Abraham Chandy, Shemi P. M. and Alwin Poulose
AgriEngineering 2025, 7(8), 243; https://doi.org/10.3390/agriengineering7080243 - 1 Aug 2025
Viewed by 249
Abstract
In the era of smart agriculture, accurate identification of plant species is critical for effective crop management, biodiversity monitoring, and the sustainable use of medicinal resources. However, existing deep learning approaches often underperform when applied to fine-grained plant classification tasks due to the lack of domain-specific, high-quality datasets and the limited representational capacity of traditional architectures. This study addresses these challenges by introducing a novel, well-curated leaf image dataset consisting of 39 classes of medicinal and aromatic plants collected from the Aromatic and Medicinal Plant Research Station in Odakkali, Kerala, India. To overcome performance bottlenecks observed with a baseline Convolutional Neural Network (CNN) that achieved only 44.94% accuracy, we progressively enhanced model performance through a series of architectural innovations. These included the use of a pre-trained VGG16 network, data augmentation techniques, and fine-tuning of deeper convolutional layers, followed by the integration of Squeeze-and-Excitation (SE) attention blocks. Ultimately, we propose a hybrid deep learning architecture that combines VGG16 with Batch Normalization, Gated Recurrent Units (GRUs), Transformer modules, and Dilated Convolutions. This final model achieved a peak validation accuracy of 95.24%, significantly outperforming several baseline models, such as custom CNN (44.94%), VGG-19 (59.49%), VGG-16 before augmentation (71.52%), Xception (85.44%), Inception v3 (87.97%), VGG-16 after data augmentation (89.24%), VGG-16 after fine-tuning (90.51%), MobileNetV2 (93.67%), and VGG16 with SE block (94.94%). These results demonstrate superior capability in capturing both local textures and global morphological features. The proposed solution not only advances the state of the art in plant classification but also contributes a valuable dataset to the research community. 
Its real-world applicability spans field-based plant identification, biodiversity conservation, and precision agriculture, offering a scalable tool for automated plant recognition in complex ecological and agricultural environments. Full article
(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)

21 pages, 4147 KiB  
Article
OLTEM: Lumped Thermal and Deep Neural Model for PMSM Temperature
by Yuzhong Sheng, Xin Liu, Qi Chen, Zhenghao Zhu, Chuangxin Huang and Qiuliang Wang
AI 2025, 6(8), 173; https://doi.org/10.3390/ai6080173 - 31 Jul 2025
Viewed by 288
Abstract
Background and Objective: Temperature management is key for reliable operation of permanent magnet synchronous motors (PMSMs). The lumped-parameter thermal network (LPTN) is fast and interpretable but struggles with nonlinear behavior under high power density. We propose OLTEM, a physics-informed deep model that combines LPTN with a thermal neural network (TNN) to improve prediction accuracy while keeping physical meaning. Methods: OLTEM embeds LPTN into a recurrent state-space formulation and learns three parameter sets: thermal conductance, inverse thermal capacitance, and power loss. Two additions are introduced: (i) a state-conditioned squeeze-and-excitation (SC-SE) attention that adapts feature weights using the current temperature state, and (ii) an enhanced power-loss sub-network that uses a deep MLP with SC-SE and non-negativity constraints. The model is trained and evaluated on the public Electric Motor Temperature dataset (Paderborn University/Kaggle). Performance is measured by mean squared error (MSE) and maximum absolute error across permanent-magnet, stator-yoke, stator-tooth, and stator-winding temperatures. Results: OLTEM tracks fast thermal transients and yields lower MSE than both the baseline TNN and a CNN–RNN model for all four components. On a held-out generalization set, MSE remains below 4.0 °C² and the maximum absolute error is about 4.3–8.2 °C. Ablation shows that removing either SC-SE or the enhanced power-loss module degrades accuracy, confirming their complementary roles. Conclusions: By combining physics with learned attention and loss modeling, OLTEM improves PMSM temperature prediction while preserving interpretability. This approach can support motor thermal design and control; future work will study transfer to other machines and further reduce short-term errors during abrupt operating changes. Full article
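
The recurrent state-space form of a lumped-parameter thermal network can be sketched as one explicit-Euler update per time step. The example below assumes the standard node equation dT_i/dt = (P_i + sum_j G_ij (T_j - T_i)) / C_i and uses fixed, made-up values for conductance, inverse capacitance, and power loss. In OLTEM these three quantities are learned by sub-networks, which this sketch does not attempt to reproduce; `lptn_step` is a hypothetical name.

```python
def lptn_step(temps, conductance, inv_capacitance, power_loss, dt):
    """One explicit-Euler step of dT_i/dt = (P_i + sum_j G_ij (T_j - T_i)) / C_i."""
    new_temps = []
    for i, t_i in enumerate(temps):
        # Net heat flowing into node i from every other node (W).
        heat_flow = sum(
            conductance[i][j] * (t_j - t_i) for j, t_j in enumerate(temps) if j != i
        )
        new_temps.append(t_i + dt * inv_capacitance[i] * (power_loss[i] + heat_flow))
    return new_temps


# Assumed toy network: two nodes, symmetric conductance, no internal losses.
temps = [80.0, 40.0]                      # degrees C
conductance = [[0.0, 0.5], [0.5, 0.0]]    # W/K
inv_capacitance = [0.01, 0.01]            # K/J
power_loss = [0.0, 0.0]                   # W
first = lptn_step(temps, conductance, inv_capacitance, power_loss, dt=1.0)

state = temps
for _ in range(2000):
    state = lptn_step(state, conductance, inv_capacitance, power_loss, dt=1.0)
```

With no power input and equal capacitances, the two node temperatures relax toward their common mean of 60 °C, which is a quick sanity check on the sign conventions.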

27 pages, 5740 KiB  
Article
Localization of Multiple GNSS Interference Sources Based on Target Detection in C/N0 Distribution Maps
by Qidong Chen, Rui Liu, Qiuzhen Yan, Yue Xu, Yang Liu, Xiao Huang and Ying Zhang
Remote Sens. 2025, 17(15), 2627; https://doi.org/10.3390/rs17152627 - 29 Jul 2025
Viewed by 268
Abstract
The localization of multiple interference sources in Global Navigation Satellite Systems (GNSS) can be achieved using carrier-to-noise ratio (C/N0) information provided by GNSS receivers, such as those embedded in smartphones. However, in increasingly prevalent complex scenarios—such as the coexistence of multiple directional interferences, increased diversity and density of GNSS interference, and the presence of multiple low-power interference sources—conventional localization methods often fail to provide reliable results, thereby limiting their applicability in real-world environments. This paper presents a multi-interference sources localization method using object detection in GNSS C/N0 distribution maps. The proposed method first exploits the similarity between C/N0 data reported by GNSS receivers and image grayscale values to construct C/N0 distribution maps, thereby transforming the problem of multi-source GNSS interference localization into an object detection and localization task based on image processing techniques. Subsequently, an Oriented Squeeze-and-Excitation-based Faster Region-based Convolutional Neural Network (OSF-RCNN) framework is proposed to process the C/N0 distribution maps. Building upon the Faster R-CNN framework, the proposed method integrates an Oriented RPN (Region Proposal Network) to regress the orientation angles of directional antennas, effectively addressing their rotational characteristics. Additionally, the Squeeze-and-Excitation (SE) mechanism and the Feature Pyramid Network (FPN) are integrated at key stages of the network to improve sensitivity to small targets, thereby enhancing detection and localization performance for low-power interference sources. The simulation results verify the effectiveness of the proposed method in accurately localizing multiple interference sources under the increasingly prevalent complex scenarios described above. Full article
(This article belongs to the Special Issue Advanced Multi-GNSS Positioning and Its Applications in Geoscience)

25 pages, 7623 KiB  
Article
ASHM-YOLOv9: A Detection Model for Strawberry in Greenhouses at Multiple Stages
by Yan Mo, Shaowei Bai and Wei Chen
Appl. Sci. 2025, 15(15), 8244; https://doi.org/10.3390/app15158244 - 24 Jul 2025
Viewed by 320
Abstract
Strawberry planting requires different amounts of soil water-holding capacity and fertilizer at different growth stages. Determining the stages of strawberry growth has important guiding significance for irrigation, fertilization, and picking. Quick and accurate identification of strawberry plants at different stages can provide important information for automated strawberry planting management. We propose an improved multistage identification model for strawberry based on the YOLOv9 algorithm—the ASHM-YOLOv9 model. The original YOLOv9 showed limitations in detecting strawberries at different growth stages, particularly lower precision in identifying occluded fruits and immature stages. We enhanced the YOLOv9 model by introducing the Alterable Kernel Convolution (AKConv) to improve the recognition efficiency while ensuring precision. The squeeze-and-excitation (SE) network was added to increase the network’s capacity for characteristic derivation and its ability to fuse features. Haar wavelet downsampling (HWD) was applied to optimize the Adaptive Downsampling module (Adown) of the initial model, thereby increasing the precision of object detection. Finally, the CIoU function was replaced by the Minimum Point Distance based IoU (MPDIoU) loss function to effectively solve the problem of low precision in identifying bounding boxes. The experimental results demonstrate that, under identical conditions, the improved model achieves a precision of 97.7%, a recall of 97.2%, mAP50 of 99.1%, and mAP50-95 of 90.7%, which are 0.6%, 3.0%, 0.7%, and 7.4% greater than those of the original model, respectively. The parameters, model size, and floating-point calculations were reduced by 3.7%, 5.6% and 3.8%, respectively, which significantly boosted the performance of the original model and outperformed that of the other models. Experiments revealed that the model could provide technical support for the multistage identification of strawberry planting. Full article
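
The MPDIoU loss mentioned above can be illustrated for a single pair of boxes. This sketch assumes the commonly cited formulation, in which IoU is reduced by the squared distances between the two top-left corners and the two bottom-right corners, each normalised by the squared image diagonal; the abstract itself does not spell out the formula, and `mpdiou_loss` is a hypothetical name.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Boxes are (x1, y1, x2, y2) with x1 < x2 and y1 < y2, in pixels."""
    # Intersection rectangle (zero area if the boxes do not overlap).
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    iou = inter / (area_pred + area_target - inter)
    # Squared corner distances, normalised by the squared image diagonal.
    diag_sq = img_w ** 2 + img_h ** 2
    d1_sq = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d2_sq = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    mpdiou = iou - d1_sq / diag_sq - d2_sq / diag_sq
    return 1.0 - mpdiou


# Assumed toy boxes in a 100 x 100 image.
same = mpdiou_loss((0, 0, 10, 10), (0, 0, 10, 10), 100, 100)
shifted = mpdiou_loss((0, 0, 10, 10), (5, 0, 15, 10), 100, 100)
```

Identical boxes give a loss of zero. Unlike plain IoU, the corner-distance terms still change for boxes that do not overlap at all, so the loss keeps providing a signal in that case.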

27 pages, 5193 KiB  
Article
Fault Diagnosis Method of Plunger Pump Based on Meta-Learning and Improved Multi-Channel Convolutional Neural Network Under Small Sample Condition
by Xiwang Yang, Jiancheng Ma, Hongjun Hu, Jinying Huang and Licheng Jing
Sensors 2025, 25(15), 4587; https://doi.org/10.3390/s25154587 - 24 Jul 2025
Viewed by 194
Abstract
A fault diagnosis method based on meta-learning and an improved multi-channel convolutional neural network (MAML-MCCNN-ISENet) was proposed to solve the problems of insufficient feature extraction and low fault type identification accuracy of vibration signals at small sample sizes. The signal is first preprocessed using adaptive chirp mode decomposition (ACMD). A multi-channel input structure is then employed to process the multidimensional signal information after preprocessing. An improved squeeze-and-excitation network (ISENet) strengthens the network’s adaptive perception of the significance of each channel feature. On this basis, a meta-learning strategy is introduced, the learning process of model initialization parameters is improved, the network is optimized by a multi-task learning mechanism, and the initial parameters of the diagnosis model are adaptively adjusted, so that the model can quickly adapt to new fault diagnosis tasks on limited datasets. As a result, the overfitting problem under small sample conditions is alleviated, and the accuracy and robustness of fault identification are improved. Finally, the performance of the model is verified on the experimental data of the fault diagnosis of the laboratory plunger pump and the vibration dataset of the centrifugal pump of the Saint Longoval Institute of Engineering and Technology. The results show that the diagnostic accuracy of the proposed method for various diagnostic tasks can reach more than 90% on small samples. Full article
(This article belongs to the Section Fault Diagnosis & Sensors)

19 pages, 5417 KiB  
Article
SE-TFF: Adaptive Tourism-Flow Forecasting Under Sparse and Heterogeneous Data via Multi-Scale SE-Net
by Jinyuan Zhang, Tao Cui and Peng He
Appl. Sci. 2025, 15(15), 8189; https://doi.org/10.3390/app15158189 - 23 Jul 2025
Viewed by 217
Abstract
Accurate and timely forecasting of cross-regional tourist flows is essential for sustainable destination management, yet existing models struggle with sparse data, complex spatiotemporal interactions, and limited interpretability. This paper presents SE-TFF, a multi-scale tourism-flow forecasting framework that couples a Squeeze-and-Excitation (SE) network with reinforcement-driven optimization to adaptively re-weight environmental, economic, and social features. A benchmark dataset of 17.8 million records from 64 countries and 743 cities (2016–2024) is compiled from the Open Travel Data (OPTD) repository on GitHub for training and validation. SE-TFF introduces (i) a multi-channel SE module for fine-grained feature selection under heterogeneous conditions, (ii) a Top-K attention filter to preserve salient context in highly sparse matrices, and (iii) a Double-DQN layer that dynamically balances prediction objectives. Experimental results show SE-TFF attains 56.5% MAE and 65.6% RMSE reductions over the best baseline (ARIMAX) at 20% sparsity, with 0.92 × 10³ average MAE across multi-task outputs. SHAP analysis ranks climate anomalies, tourism revenue, and employment as dominant predictors. These gains demonstrate SE-TFF’s ability to deliver real-time, interpretable forecasts for data-limited destinations. Future work will incorporate real-time social media signals and larger multimodal datasets to enhance generalizability. Full article

15 pages, 4874 KiB  
Article
A Novel 3D Convolutional Neural Network-Based Deep Learning Model for Spatiotemporal Feature Mapping for Video Analysis: Feasibility Study for Gastrointestinal Endoscopic Video Classification
by Mrinal Kanti Dhar, Mou Deb, Poonguzhali Elangovan, Keerthy Gopalakrishnan, Divyanshi Sood, Avneet Kaur, Charmy Parikh, Swetha Rapolu, Gianeshwaree Alias Rachna Panjwani, Rabiah Aslam Ansari, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Scott A. Helgeson, Venkata S. Akshintala and Shivaram P. Arunachalam
J. Imaging 2025, 11(7), 243; https://doi.org/10.3390/jimaging11070243 - 18 Jul 2025
Viewed by 473
Abstract
Accurate analysis of medical videos remains a major challenge in deep learning (DL) due to the need for effective spatiotemporal feature mapping that captures both spatial detail and temporal dynamics. Despite advances in DL, most existing models in medical AI focus on static images, overlooking critical temporal cues present in video data. To bridge this gap, a novel DL-based framework is proposed for spatiotemporal feature extraction from medical video sequences. As a feasibility use case, this study focuses on gastrointestinal (GI) endoscopic video classification. A 3D convolutional neural network (CNN) is developed to classify upper and lower GI endoscopic videos using the hyperKvasir dataset, which contains 314 lower and 60 upper GI videos. To address data imbalance, 60 matched pairs of videos are randomly selected across 20 experimental runs. Videos are resized to 224 × 224, and the 3D CNN captures spatiotemporal information. A 3D version of the parallel spatial and channel squeeze-and-excitation (P-scSE) is implemented, and a new block called the residual with parallel attention (RPA) block is proposed by combining P-scSE3D with a residual block. To reduce computational complexity, a (2 + 1)D convolution is used in place of full 3D convolution. The model achieves an average accuracy of 0.933, precision of 0.932, recall of 0.944, F1-score of 0.935, and AUC of 0.933. It is also observed that the integration of P-scSE3D increased the F1-score by 7%. This preliminary work opens avenues for exploring various GI endoscopic video-based prospective studies. Full article
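
The saving from replacing a full 3D convolution with a (2 + 1)D factorisation can be checked with simple parameter arithmetic. The sketch below counts weights only (no biases) and assumes a hypothetical intermediate channel width `c_mid` between the spatial and temporal convolutions; the abstract does not state the widths used in the paper.

```python
def conv3d_params(c_in, c_out, k_t, k_s):
    """Weights in a full k_t x k_s x k_s 3D convolution."""
    return c_in * c_out * k_t * k_s * k_s


def conv2plus1d_params(c_in, c_mid, c_out, k_t, k_s):
    """Weights in a 1 x k_s x k_s spatial conv followed by a k_t x 1 x 1 temporal conv."""
    spatial = c_in * c_mid * k_s * k_s
    temporal = c_mid * c_out * k_t
    return spatial + temporal


# Assumed layer sizes: 64 channels throughout, 3 x 3 x 3 kernel.
full = conv3d_params(64, 64, 3, 3)
factored = conv2plus1d_params(64, 64, 64, 3, 3)
```

For this assumed layer the counts are 110,592 weights for the full 3D kernel and 49,152 for the factorised pair, a reduction of a little more than half.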

20 pages, 4616 KiB  
Article
Temporal Convolutional Network with Attention Mechanisms for Strong Wind Early Warning in High-Speed Railway Systems
by Wei Gu, Guoyuan Yang, Hongyan Xing, Yajing Shi and Tongyuan Liu
Sustainability 2025, 17(14), 6339; https://doi.org/10.3390/su17146339 - 10 Jul 2025
Viewed by 405
Abstract
High-speed railway (HSR) is a key transport mode for achieving carbon reduction targets and promoting sustainable regional economic development due to its fast, efficient, and low-carbon nature. Accurate wind speed forecasting (WSF) is vital for HSR systems, as it provides future wind conditions that are critical for ensuring safe train operations. Numerous WSF schemes based on deep learning have been proposed. However, accurately forecasting strong wind events remains challenging due to the complex and dynamic nature of wind. In this study, we propose a novel hybrid network architecture, MHSETCN-LSTM, for forecasting strong wind. The MHSETCN-LSTM integrates temporal convolutional networks (TCNs) and long short-term memory networks (LSTMs) to capture both short-term fluctuations and long-term trends in wind behavior. The multi-head squeeze-and-excitation (MHSE) attention mechanism dynamically recalibrates the importance of different aspects of the input sequence, allowing the model to focus on critical time steps, particularly when abrupt wind events occur. In addition to wind speed, we introduce wind direction (WD) to characterize wind behavior due to its impact on the aerodynamic forces acting on trains. To maintain the periodicity of WD, we employ a triangular transform to predict the sine and cosine values of WD, improving the reliability of predictions. Extensive experiments were conducted to evaluate the effectiveness of the proposed method based on real-world wind data collected from sensors along the Beijing–Baotou railway. Experimental results demonstrated that our model outperforms state-of-the-art solutions for WSF, achieving a mean-squared error (MSE) of 0.0393, a root-mean-squared error (RMSE) of 0.1982, and a coefficient of determination (R²) of 99.59%. 
These experimental results validate the efficacy of our proposed model in enhancing the resilience and sustainability of railway infrastructure. Furthermore, the model can be utilized in other wind-sensitive sectors, such as highways, ports, and offshore wind operations. This will further promote the achievement of Sustainable Development Goal 9. Full article
(This article belongs to the Section Environmental Sustainability and Applications)
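
The sine and cosine treatment of wind direction can be shown in a few lines. Encoding an angle as a (sin, cos) pair removes the artificial jump between 359° and 0°, and `atan2` recovers the angle from a predicted pair. The function names are hypothetical, and the paper's exact pre- and post-processing is not given in the abstract.

```python
import math


def encode_direction(degrees):
    """Map a wind direction in degrees to a (sin, cos) pair."""
    radians = math.radians(degrees)
    return math.sin(radians), math.cos(radians)


def decode_direction(sin_value, cos_value):
    """Recover a direction in [0, 360) degrees from a (sin, cos) pair."""
    return math.degrees(math.atan2(sin_value, cos_value)) % 360.0


def pair_distance(a, b):
    """Euclidean distance between two encoded directions."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


north_west = encode_direction(350.0)
north_east = encode_direction(10.0)
south = encode_direction(180.0)
recovered = decode_direction(*north_west)
```

In the encoded space, 350° and 10° are close together and both are far from 180°, which matches their physical relationship. A raw-degree target would instead place 350° and 10° 340 units apart.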

18 pages, 70320 KiB  
Article
RIS-UNet: A Multi-Level Hierarchical Framework for Liver Tumor Segmentation in CT Images
by Yuchai Wan, Lili Zhang and Murong Wang
Entropy 2025, 27(7), 735; https://doi.org/10.3390/e27070735 - 9 Jul 2025
Viewed by 438
Abstract
The deep learning-based analysis of liver CT images is expected to provide assistance for clinicians in the diagnostic decision-making process. However, the accuracy of existing methods still falls short of clinical requirements and needs to be further improved. Therefore, in this work, we propose a novel multi-level hierarchical framework for liver tumor segmentation. In the first level, we integrate inter-slice spatial information by a 2.5D network to resolve the accuracy–efficiency trade-off inherent in conventional 2D/3D segmentation strategies for liver tumor segmentation. Then, the second level extracts the inner-slice global and local features for enhancing feature representation. We propose the Res-Inception-SE Block, which combines residual connections, multi-scale Inception modules, and squeeze-excitation attention to capture comprehensive global and local features. Furthermore, we design a hybrid loss function combining Binary Cross Entropy (BCE) and Dice loss to solve the category imbalance problem and accelerate convergence. Extensive experiments on the LiTS17 dataset demonstrate the effectiveness of our method on accuracy, efficiency, and visual results for liver tumor segmentation. Full article
(This article belongs to the Special Issue Cutting-Edge AI in Computational Bioinformatics)
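The hybrid loss mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not state how the BCE and Dice terms are weighted, so the `bce_weight` parameter, the `smooth` constant, and all function names here are assumptions for illustration, not the authors' implementation.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross entropy over flattened per-pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """Soft Dice loss: 1 - Dice coefficient (smooth is an assumed constant)."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def hybrid_loss(probs, targets, bce_weight=0.5):
    """Weighted sum of BCE and Dice; the 0.5 weighting is hypothetical."""
    return (bce_weight * bce_loss(probs, targets)
            + (1.0 - bce_weight) * dice_loss(probs, targets))


# Toy usage: a mostly-background mask with one tumor pixel.
targets = [0, 0, 0, 1]
print(hybrid_loss([0.1, 0.1, 0.1, 0.8], targets))
```

The Dice term is computed from overlap rather than per-pixel counts, which is why it is commonly paired with BCE when foreground pixels are rare.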

21 pages, 3406 KiB  
Article
ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification
by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu
Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025
Viewed by 584
Abstract
Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze and Excitation blocks, and Convolution Block Attention Modules (ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The recently developed attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows for the addition or removal of feature classes without the need to retrain the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high defect rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional deep learning models, such as parametric-based YOLO series models in defect detection, achieving higher accuracy and lower miss rates, highlighting its potential for high-reliability industrial deployment. Full article
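The combination of triplet loss and KNN classification described in the abstract above can be sketched roughly as follows. This is a toy illustration on hand-written 2D embeddings; the margin, the value of `k`, the class labels, and every function name are assumptions, and the embedding network itself is omitted.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Zero once the negative is at least `margin` farther than the positive."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)


def knn_predict(query, gallery, k=3):
    """Majority vote among the k nearest (embedding, label) gallery entries."""
    nearest = sorted(gallery, key=lambda item: euclidean(query, item[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical gallery of already-embedded reference samples.
gallery = [
    ([0.0, 0.0], "good"), ([0.2, 0.1], "good"), ([0.1, 0.3], "good"),
    ([5.0, 5.0], "defect"), ([5.2, 4.9], "defect"), ([4.8, 5.1], "defect"),
]
print(knn_predict([4.5, 5.0], gallery))

# A new class is added by extending the gallery; nothing is retrained.
gallery += [([9.0, 0.0], "scratch"), ([9.1, 0.2], "scratch"),
            ([8.9, 0.1], "scratch")]
print(knn_predict([9.0, 0.1], gallery))
```

Because classification only compares distances to stored embeddings, adding or removing a class changes the gallery rather than the network weights, which is the property the abstract attributes to metric learning.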

20 pages, 1935 KiB  
Article
Residual Attention Network with Atrous Spatial Pyramid Pooling for Soil Element Estimation in LUCAS Hyperspectral Data
by Yun Deng, Yuchen Cao, Shouxue Chen and Xiaohui Cheng
Appl. Sci. 2025, 15(13), 7457; https://doi.org/10.3390/app15137457 - 3 Jul 2025
Viewed by 311
Abstract
Visible and near-infrared (Vis–NIR) spectroscopy enables the rapid prediction of soil properties but faces three limitations with conventional machine learning: information loss and overfitting from high-dimensional spectral features; inadequate modeling of nonlinear soil–spectra relationships; and failure to integrate multi-scale spatial features. To address these challenges, we propose ReSE-AP Net, a multi-scale attention residual network with spatial pyramid pooling. Built on convolutional residual blocks, the model incorporates a squeeze-and-excitation channel attention mechanism to recalibrate feature weights and an atrous spatial pyramid pooling (ASPP) module to extract multi-resolution spectral features. This architecture synergistically represents weak absorption peaks (400–1000 nm) and broad spectral bands (1000–2500 nm), overcoming single-scale modeling limitations. Validation on the LUCAS2009 dataset demonstrated that ReSE-AP Net outperformed conventional machine learning by improving the R² by 2.8–36.5% and reducing the RMSE by 14.2–69.2%. Compared with existing deep learning methods, it increased the R² by 0.4–25.5% for clay, silt, sand, organic carbon, calcium carbonate, and phosphorus predictions, and decreased the RMSE by 0.7–39.0%. Our contributions include statistical analysis of LUCAS2009 spectra, identification of conventional method limitations, development of the ReSE-AP Net model, ablation studies, and comprehensive comparisons with alternative approaches. Full article
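As a rough illustration of the squeeze-and-excitation channel recalibration named in the abstract above, the sketch below applies it to a small 1D feature map in plain Python. The weight matrices `w1` and `w2`, the layer sizes, and the function name are hypothetical; a real model would learn these weights and would typically use a deep learning framework.

```python
import math


def se_recalibrate(feature_map, w1, w2):
    """Squeeze-and-excitation over a C x L feature map (list of channels).

    w1: hidden x C weights (reduction layer), w2: C x hidden weights.
    Returns the rescaled feature map and the per-channel scales.
    """
    # Squeeze: global average pooling, one value per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_map]
    # Excitation, step 1: fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w1]
    # Excitation, step 2: fully connected layer followed by sigmoid.
    scales = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
              for row in w2]
    # Recalibrate: multiply every position in a channel by its scale.
    rescaled = [[s * v for v in ch] for s, ch in zip(scales, feature_map)]
    return rescaled, scales


# Toy usage: 2 channels, 3 spectral positions, 1 hidden unit.
features = [[1.0, 2.0, 3.0], [0.5, 0.5, 0.5]]
w1 = [[0.5, -0.5]]       # hypothetical learned weights
w2 = [[1.0], [-1.0]]     # hypothetical learned weights
rescaled, scales = se_recalibrate(features, w1, w2)
print(scales)
```

Each scale lies strictly between 0 and 1, so the block can only attenuate channels relative to one another; it does not change the shape of the feature map.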
