MDPI - Publisher of Open Access Journals

24 pages, 29785 KiB

Open Access Article

Multi-Scale Feature Extraction with 3D Complex-Valued Network for PolSAR Image Classification

by Nana Jiang, Wenbo Zhao, Jiao Guo, Qiang Zhao and Jubo Zhu

Remote Sens. 2025, 17(15), 2663; https://doi.org/10.3390/rs17152663 (registering DOI) - 1 Aug 2025

Viewed by 50

Compared to traditional real-valued neural networks, which process only amplitude information, complex-valued neural networks handle both amplitude and phase information, leading to superior performance in polarimetric synthetic aperture radar (PolSAR) image classification tasks. This paper proposes a multi-scale feature extraction (MSFE) method based on a 3D complex-valued network to improve classification accuracy by fully leveraging multi-scale features, including phase information. We first designed a complex-valued three-dimensional network framework combining complex-valued 3D convolution (CV-3DConv) with complex-valued squeeze-and-excitation (CV-SE) modules. This framework is capable of simultaneously capturing spatial and polarimetric features, including both amplitude and phase information, from PolSAR images. Furthermore, to address robustness degradation from limited labeled samples, we introduced a multi-scale learning strategy that jointly models global and local features. Specifically, global features extract overall semantic information, while local features help the network capture region-specific semantics. This strategy enhances information utilization by integrating multi-scale receptive fields, complementing feature advantages. Extensive experiments on four benchmark datasets demonstrated that the proposed method outperforms various comparison methods, maintaining high classification accuracy across different sampling rates, thus validating its effectiveness and robustness. Full article

(This article belongs to the Special Issue Advances in AI-Driven Synthetic Aperture Radar (SAR): Data Processing to Automatic Interpretation)

26 pages, 1790 KiB

Open Access Article

A Hybrid Deep Learning Model for Aromatic and Medicinal Plant Species Classification Using a Curated Leaf Image Dataset

by Shareena E. M., D. Abraham Chandy, Shemi P. M. and Alwin Poulose

AgriEngineering 2025, 7(8), 243; https://doi.org/10.3390/agriengineering7080243 - 1 Aug 2025

Viewed by 84

Abstract

In the era of smart agriculture, accurate identification of plant species is critical for effective crop management, biodiversity monitoring, and the sustainable use of medicinal resources. However, existing deep learning approaches often underperform when applied to fine-grained plant classification tasks due to the lack of domain-specific, high-quality datasets and the limited representational capacity of traditional architectures. This study addresses these challenges by introducing a novel, well-curated leaf image dataset consisting of 39 classes of medicinal and aromatic plants collected from the Aromatic and Medicinal Plant Research Station in Odakkali, Kerala, India. To overcome performance bottlenecks observed with a baseline Convolutional Neural Network (CNN) that achieved only 44.94% accuracy, we progressively enhanced model performance through a series of architectural innovations. These included the use of a pre-trained VGG16 network, data augmentation techniques, and fine-tuning of deeper convolutional layers, followed by the integration of Squeeze-and-Excitation (SE) attention blocks. Ultimately, we propose a hybrid deep learning architecture that combines VGG16 with Batch Normalization, Gated Recurrent Units (GRUs), Transformer modules, and Dilated Convolutions. This final model achieved a peak validation accuracy of 95.24%, significantly outperforming several baseline models, such as custom CNN (44.94%), VGG-19 (59.49%), VGG-16 before augmentation (71.52%), Xception (85.44%), Inception v3 (87.97%), VGG-16 after data augmentation (89.24%), VGG-16 after fine-tuning (90.51%), MobileNetV2 (93.67%), and VGG16 with SE block (94.94%). These results demonstrate superior capability in capturing both local textures and global morphological features. The proposed solution not only advances the state of the art in plant classification but also contributes a valuable dataset to the research community. Its real-world applicability spans field-based plant identification, biodiversity conservation, and precision agriculture, offering a scalable tool for automated plant recognition in complex ecological and agricultural environments. Full article

(This article belongs to the Special Issue Implementation of Artificial Intelligence in Agriculture)

21 pages, 4147 KiB

Open Access Article

OLTEM: Lumped Thermal and Deep Neural Model for PMSM Temperature

by Yuzhong Sheng, Xin Liu, Qi Chen, Zhenghao Zhu, Chuangxin Huang and Qiuliang Wang

AI 2025, 6(8), 173; https://doi.org/10.3390/ai6080173 - 31 Jul 2025

Viewed by 161

Abstract

Background and Objective: Temperature management is key for reliable operation of permanent magnet synchronous motors (PMSMs). The lumped-parameter thermal network (LPTN) is fast and interpretable but struggles with nonlinear behavior under high power density. We propose OLTEM, a physics-informed deep model that combines LPTN with a thermal neural network (TNN) to improve prediction accuracy while keeping physical meaning. Methods: OLTEM embeds LPTN into a recurrent state-space formulation and learns three parameter sets: thermal conductance, inverse thermal capacitance, and power loss. Two additions are introduced: (i) a state-conditioned squeeze-and-excitation (SC-SE) attention that adapts feature weights using the current temperature state, and (ii) an enhanced power-loss sub-network that uses a deep MLP with SC-SE and non-negativity constraints. The model is trained and evaluated on the public Electric Motor Temperature dataset (Paderborn University/Kaggle). Performance is measured by mean squared error (MSE) and maximum absolute error across permanent-magnet, stator-yoke, stator-tooth, and stator-winding temperatures. Results: OLTEM tracks fast thermal transients and yields lower MSE than both the baseline TNN and a CNN–RNN model for all four components. On a held-out generalization set, MSE remains below 4.0 °C² and the maximum absolute error is about 4.3–8.2 °C. Ablation shows that removing either SC-SE or the enhanced power-loss module degrades accuracy, confirming their complementary roles. Conclusions: By combining physics with learned attention and loss modeling, OLTEM improves PMSM temperature prediction while preserving interpretability. This approach can support motor thermal design and control; future work will study transfer to other machines and further reduce short-term errors during abrupt operating changes. Full article
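The lumped-parameter thermal network that OLTEM builds on can be illustrated with a single explicit-Euler update of node temperatures. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `lptn_step`, the Euler discretization, and the fixed numeric parameters are hypothetical, whereas in the abstract the conductances, inverse capacitances, and power losses are learned by the network.

```python
def lptn_step(temps, conductance, inv_capacitance, power_loss, dt):
    """Advance node temperatures by one explicit-Euler step (illustrative only).

    temps[i]           : temperature of node i (deg C)
    conductance[i][j]  : thermal conductance between nodes i and j (W/K)
    inv_capacitance[i] : inverse thermal capacitance of node i (K/J)
    power_loss[i]      : heat generated inside node i (W)
    dt                 : step length (s)
    """
    n = len(temps)
    new_temps = []
    for i in range(n):
        # Net heat flowing into node i from every other node.
        inflow = sum(
            conductance[i][j] * (temps[j] - temps[i])
            for j in range(n)
            if j != i
        )
        # dT/dt = (1/C) * (heat exchanged + heat generated)
        new_temps.append(temps[i] + dt * inv_capacitance[i] * (inflow + power_loss[i]))
    return new_temps


# Two nodes exchanging heat with no internal losses (made-up numbers).
step = lptn_step(
    temps=[100.0, 20.0],
    conductance=[[0.0, 1.0], [1.0, 0.0]],
    inv_capacitance=[0.1, 0.1],
    power_loss=[0.0, 0.0],
    dt=1.0,
)
```

With equal capacitances and no losses, the hot node cools by the same amount the cold node warms, which is a quick sanity check on the sign conventions.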

27 pages, 5740 KiB

Open Access Feature Paper Article

Localization of Multiple GNSS Interference Sources Based on Target Detection in C/N₀ Distribution Maps

by Qidong Chen, Rui Liu, Qiuzhen Yan, Yue Xu, Yang Liu, Xiao Huang and Ying Zhang

Remote Sens. 2025, 17(15), 2627; https://doi.org/10.3390/rs17152627 - 29 Jul 2025

Viewed by 245

Abstract

The localization of multiple interference sources in Global Navigation Satellite Systems (GNSS) can be achieved using carrier-to-noise ratio (C/N₀) information provided by GNSS receivers, such as those embedded in smartphones. However, in increasingly prevalent complex scenarios—such as the coexistence of multiple directional interferences, increased diversity and density of GNSS interference, and the presence of multiple low-power interference sources—conventional localization methods often fail to provide reliable results, thereby limiting their applicability in real-world environments. This paper presents a multi-interference sources localization method using object detection in GNSS C/N₀ distribution maps. The proposed method first exploits the similarity between C/N₀ data reported by GNSS receivers and image grayscale values to construct C/N₀ distribution maps, thereby transforming the problem of multi-source GNSS interference localization into an object detection and localization task based on image processing techniques. Subsequently, an Oriented Squeeze-and-Excitation-based Faster Region-based Convolutional Neural Network (OSF-RCNN) framework is proposed to process the C/N₀ distribution maps. Building upon the Faster R-CNN framework, the proposed method integrates an Oriented RPN (Region Proposal Network) to regress the orientation angles of directional antennas, effectively addressing their rotational characteristics. Additionally, the Squeeze-and-Excitation (SE) mechanism and the Feature Pyramid Network (FPN) are integrated at key stages of the network to improve sensitivity to small targets, thereby enhancing detection and localization performance for low-power interference sources. The simulation results verify the effectiveness of the proposed method in accurately localizing multiple interference sources under the increasingly prevalent complex scenarios described above. Full article

(This article belongs to the Special Issue Advanced Multi-GNSS Positioning and Its Applications in Geoscience)

25 pages, 7623 KiB

Open Access Article

ASHM-YOLOv9: A Detection Model for Strawberry in Greenhouses at Multiple Stages

by Yan Mo, Shaowei Bai and Wei Chen

Appl. Sci. 2025, 15(15), 8244; https://doi.org/10.3390/app15158244 - 24 Jul 2025

Viewed by 305

Abstract

Strawberry cultivation requires different soil water-holding capacity and fertilizer levels at different growth stages. Determining the growth stage of strawberries therefore provides important guidance for irrigation, fertilization, and picking. Quick and accurate identification of strawberry plants at different stages can provide important information for automated strawberry planting management. We propose an improved multistage identification model for strawberry based on the YOLOv9 algorithm, called ASHM-YOLOv9. The original YOLOv9 showed limitations in detecting strawberries at different growth stages, particularly lower precision in identifying occluded fruits and immature stages. We enhanced the YOLOv9 model by introducing the Alterable Kernel Convolution (AKConv) to improve the recognition efficiency while ensuring precision. The squeeze-and-excitation (SE) network was added to increase the network’s capacity for feature extraction and its ability to fuse features. Haar wavelet downsampling (HWD) was applied to optimize the Adaptive Downsampling module (Adown) of the initial model, thereby increasing the precision of object detection. Finally, the CIoU loss function was replaced by the Minimum Point Distance based IoU (MPDIoU) loss function to address the low precision of bounding-box localization. The experimental results demonstrate that, under identical conditions, the improved model achieves a precision of 97.7%, a recall of 97.2%, mAP50 of 99.1%, and mAP50-95 of 90.7%, which are 0.6%, 3.0%, 0.7%, and 7.4% greater than those of the original model, respectively. The number of parameters, model size, and floating-point operations were reduced by 3.7%, 5.6%, and 3.8%, respectively. The improved model thus outperforms both the original model and the other models compared. Experiments revealed that the model could provide technical support for the multistage identification of strawberry planting. Full article
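The MPDIoU loss mentioned above can be sketched for a single pair of boxes. This is an illustrative sketch that assumes the commonly cited formulation (IoU minus the squared distances between the two top-left corners and between the two bottom-right corners, each normalized by the squared image diagonal); the function name `mpdiou_loss` and the box layout are hypothetical, and the abstract does not state which variant the authors implement.

```python
def mpdiou_loss(pred, target, img_w, img_h):
    """Return 1 - MPDIoU for two boxes given as (x1, y1, x2, y2).

    Assumes x1 < x2 and y1 < y2 for both boxes; img_w and img_h are the
    width and height of the input image used for normalization.
    """
    # Intersection rectangle (zero area if the boxes do not overlap).
    ix1, iy1 = max(pred[0], target[0]), max(pred[1], target[1])
    ix2, iy2 = min(pred[2], target[2]), min(pred[3], target[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_pred = (pred[2] - pred[0]) * (pred[3] - pred[1])
    area_target = (target[2] - target[0]) * (target[3] - target[1])
    iou = inter / (area_pred + area_target - inter)

    # Squared distances between matching corners.
    d_top_left = (pred[0] - target[0]) ** 2 + (pred[1] - target[1]) ** 2
    d_bottom_right = (pred[2] - target[2]) ** 2 + (pred[3] - target[3]) ** 2
    norm = img_w ** 2 + img_h ** 2

    mpdiou = iou - d_top_left / norm - d_bottom_right / norm
    return 1.0 - mpdiou
```

Unlike plain IoU, the corner-distance terms keep the loss informative even when the predicted and ground-truth boxes do not overlap at all.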

27 pages, 5193 KiB

Open Access Article

Fault Diagnosis Method of Plunger Pump Based on Meta-Learning and Improved Multi-Channel Convolutional Neural Network Under Small Sample Condition

by Xiwang Yang, Jiancheng Ma, Hongjun Hu, Jinying Huang and Licheng Jing

Sensors 2025, 25(15), 4587; https://doi.org/10.3390/s25154587 - 24 Jul 2025

Viewed by 174

Abstract

A fault diagnosis method based on meta-learning and an improved multi-channel convolutional neural network (MAML-MCCNN-ISENet) is proposed to address insufficient feature extraction and low fault-type identification accuracy for vibration signals at small sample sizes. The signal is first preprocessed using adaptive chirp mode decomposition (ACMD). A multi-channel input structure is then employed to process the multidimensional signal information obtained after preprocessing. Improved squeeze-and-excitation networks (ISENets) are introduced to strengthen the network’s adaptive perception of the significance of each channel feature. On this basis, a meta-learning strategy is introduced: the learning of the model's initialization parameters is improved, the network is optimized by a multi-task learning mechanism, and the initial parameters of the diagnosis model are adaptively adjusted, so that the model can quickly adapt to new fault diagnosis tasks on limited datasets. As a result, overfitting under small sample conditions is alleviated, and the accuracy and robustness of fault identification are improved. Finally, the performance of the model is verified on experimental fault data from a laboratory plunger pump and on the centrifugal pump vibration dataset of the Sant Longowal Institute of Engineering and Technology. The results show that the diagnostic accuracy of the proposed method can exceed 90% across the diagnostic tasks on small samples. Full article

(This article belongs to the Section Fault Diagnosis & Sensors)

19 pages, 5417 KiB

Open Access Article

SE-TFF: Adaptive Tourism-Flow Forecasting Under Sparse and Heterogeneous Data via Multi-Scale SE-Net

by Jinyuan Zhang, Tao Cui and Peng He

Appl. Sci. 2025, 15(15), 8189; https://doi.org/10.3390/app15158189 - 23 Jul 2025

Viewed by 199

Abstract

Accurate and timely forecasting of cross-regional tourist flows is essential for sustainable destination management, yet existing models struggle with sparse data, complex spatiotemporal interactions, and limited interpretability. This paper presents SE-TFF, a multi-scale tourism-flow forecasting framework that couples a Squeeze-and-Excitation (SE) network with reinforcement-driven optimization to adaptively re-weight environmental, economic, and social features. A benchmark dataset of 17.8 million records from 64 countries and 743 cities (2016–2024) is compiled from the Open Travel Data (OPTD) repository on GitHub for training and validation. SE-TFF introduces (i) a multi-channel SE module for fine-grained feature selection under heterogeneous conditions, (ii) a Top-K attention filter to preserve salient context in highly sparse matrices, and (iii) a Double-DQN layer that dynamically balances prediction objectives. Experimental results show SE-TFF attains 56.5% MAE and 65.6% RMSE reductions over the best baseline (ARIMAX) at 20% sparsity, with 0.92 × 10³ average MAE across multi-task outputs. SHAP analysis ranks climate anomalies, tourism revenue, and employment as dominant predictors. These gains demonstrate SE-TFF’s ability to deliver real-time, interpretable forecasts for data-limited destinations. Future work will incorporate real-time social media signals and larger multimodal datasets to enhance generalizability. Full article

15 pages, 4874 KiB

Open Access Article

A Novel 3D Convolutional Neural Network-Based Deep Learning Model for Spatiotemporal Feature Mapping for Video Analysis: Feasibility Study for Gastrointestinal Endoscopic Video Classification

by Mrinal Kanti Dhar, Mou Deb, Poonguzhali Elangovan, Keerthy Gopalakrishnan, Divyanshi Sood, Avneet Kaur, Charmy Parikh, Swetha Rapolu, Gianeshwaree Alias Rachna Panjwani, Rabiah Aslam Ansari, Naghmeh Asadimanesh, Shiva Sankari Karuppiah, Scott A. Helgeson, Venkata S. Akshintala and Shivaram P. Arunachalam

J. Imaging 2025, 11(7), 243; https://doi.org/10.3390/jimaging11070243 - 18 Jul 2025

Viewed by 440

Abstract

Accurate analysis of medical videos remains a major challenge in deep learning (DL) due to the need for effective spatiotemporal feature mapping that captures both spatial detail and temporal dynamics. Despite advances in DL, most existing models in medical AI focus on static images, overlooking critical temporal cues present in video data. To bridge this gap, a novel DL-based framework is proposed for spatiotemporal feature extraction from medical video sequences. As a feasibility use case, this study focuses on gastrointestinal (GI) endoscopic video classification. A 3D convolutional neural network (CNN) is developed to classify upper and lower GI endoscopic videos using the hyperKvasir dataset, which contains 314 lower and 60 upper GI videos. To address data imbalance, 60 matched pairs of videos are randomly selected across 20 experimental runs. Videos are resized to 224 × 224, and the 3D CNN captures spatiotemporal information. A 3D version of the parallel spatial and channel squeeze-and-excitation (P-scSE) is implemented, and a new block called the residual with parallel attention (RPA) block is proposed by combining P-scSE3D with a residual block. To reduce computational complexity, a (2 + 1)D convolution is used in place of full 3D convolution. The model achieves an average accuracy of 0.933, precision of 0.932, recall of 0.944, F1-score of 0.935, and AUC of 0.933. It is also observed that the integration of P-scSE3D increased the F1-score by 7%. This preliminary work opens avenues for exploring various GI endoscopic video-based prospective studies. Full article

(This article belongs to the Special Issue Clinical and Pathological Imaging in the Era of Artificial Intelligence: New Insights and Perspectives—2nd Edition)

20 pages, 4616 KiB

Open Access Article

Temporal Convolutional Network with Attention Mechanisms for Strong Wind Early Warning in High-Speed Railway Systems

by Wei Gu, Guoyuan Yang, Hongyan Xing, Yajing Shi and Tongyuan Liu

Sustainability 2025, 17(14), 6339; https://doi.org/10.3390/su17146339 - 10 Jul 2025

Viewed by 389

Abstract

High-speed railway (HSR) is a key transport mode for achieving carbon reduction targets and promoting sustainable regional economic development due to its fast, efficient, and low-carbon nature. Accurate wind speed forecasting (WSF) is vital for HSR systems, as it provides future wind conditions that are critical for ensuring safe train operations. Numerous WSF schemes based on deep learning have been proposed. However, accurately forecasting strong wind events remains challenging due to the complex and dynamic nature of wind. In this study, we propose a novel hybrid network architecture, MHSETCN-LSTM, for forecasting strong wind. The MHSETCN-LSTM integrates temporal convolutional networks (TCNs) and long short-term memory networks (LSTMs) to capture both short-term fluctuations and long-term trends in wind behavior. The multi-head squeeze-and-excitation (MHSE) attention mechanism dynamically recalibrates the importance of different aspects of the input sequence, allowing the model to focus on critical time steps, particularly when abrupt wind events occur. In addition to wind speed, we introduce wind direction (WD) to characterize wind behavior due to its impact on the aerodynamic forces acting on trains. To maintain the periodicity of WD, we employ a trigonometric transform and predict the sine and cosine values of WD, improving the reliability of predictions. Extensive experiments are conducted to evaluate the effectiveness of the proposed method based on real-world wind data collected from sensors along the Beijing–Baotou railway. Experimental results demonstrated that our model outperforms state-of-the-art solutions for WSF, achieving a mean-squared error (MSE) of 0.0393, a root-mean-squared error (RMSE) of 0.1982, and a coefficient of determination (R²) of 99.59%. These experimental results validate the efficacy of our proposed model in enhancing the resilience and sustainability of railway infrastructure. Furthermore, the model can be utilized in other wind-sensitive sectors, such as highways, ports, and offshore wind operations. This will further promote the achievement of Sustainable Development Goal 9. Full article
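The sine and cosine treatment of wind direction described above can be sketched as an encode and decode pair. This is a generic illustration of the idea, not the authors' code; the helper names `encode_direction` and `decode_direction` are hypothetical.

```python
import math


def encode_direction(degrees):
    """Map a wind direction in degrees to a (sin, cos) pair.

    Directions such as 359 and 1 degrees end up close together in this
    space, which a raw 0-360 value does not guarantee.
    """
    radians = math.radians(degrees)
    return math.sin(radians), math.cos(radians)


def decode_direction(sin_value, cos_value):
    """Recover a direction in [0, 360) from predicted sine and cosine values."""
    return math.degrees(math.atan2(sin_value, cos_value)) % 360.0


# Round trip for a direction just west of north.
recovered = decode_direction(*encode_direction(350.0))
```

Because `atan2` only depends on the ratio and signs of its arguments, the decode step still works when a model's predicted pair does not lie exactly on the unit circle.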

(This article belongs to the Section Environmental Sustainability and Applications)

18 pages, 70320 KiB

Open Access Article

RIS-UNet: A Multi-Level Hierarchical Framework for Liver Tumor Segmentation in CT Images

by Yuchai Wan, Lili Zhang and Murong Wang

Entropy 2025, 27(7), 735; https://doi.org/10.3390/e27070735 - 9 Jul 2025

Viewed by 420

Abstract

The deep learning-based analysis of liver CT images is expected to provide assistance for clinicians in the diagnostic decision-making process. However, the accuracy of existing methods still falls short of clinical requirements and needs to be further improved. Therefore, in this work, we propose a novel multi-level hierarchical framework for liver tumor segmentation. In the first level, we integrate inter-slice spatial information by a 2.5D network to resolve the accuracy–efficiency trade-off inherent in conventional 2D/3D segmentation strategies for liver tumor segmentation. Then, the second level extracts the inner-slice global and local features for enhancing feature representation. We propose the Res-Inception-SE Block, which combines residual connections, multi-scale Inception modules, and squeeze-excitation attention to capture comprehensive global and local features. Furthermore, we design a hybrid loss function combining Binary Cross Entropy (BCE) and Dice loss to solve the category imbalance problem and accelerate convergence. Extensive experiments on the LiTS17 dataset demonstrate the effectiveness of our method on accuracy, efficiency, and visual results for liver tumor segmentation. Full article
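A hybrid of Binary Cross Entropy and Dice loss, as named above, can be sketched over a flat list of per-pixel probabilities. This is a minimal illustration under assumptions: the equal weighting (`bce_weight=0.5`), the smoothing constant `eps`, and the function name `bce_dice_loss` are hypothetical, since the abstract does not give the authors' weighting.

```python
import math


def bce_dice_loss(probs, labels, bce_weight=0.5, eps=1e-7):
    """Weighted sum of binary cross entropy and Dice loss.

    probs  : predicted foreground probabilities, one per pixel
    labels : ground-truth values (0 or 1), one per pixel
    """
    n = len(probs)
    # BCE averages a per-pixel penalty, so it gives dense gradients.
    bce = -sum(
        y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
        for p, y in zip(probs, labels)
    ) / n
    # Dice measures region overlap, so it is less dominated by background pixels.
    intersection = sum(p * y for p, y in zip(probs, labels))
    dice = (2 * intersection + eps) / (sum(probs) + sum(labels) + eps)
    return bce_weight * bce + (1 - bce_weight) * (1 - dice)
```

The Dice term is what counters class imbalance when tumor pixels are a small fraction of the slice, while the BCE term tends to smooth optimization early in training.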

(This article belongs to the Special Issue Cutting-Edge AI in Computational Bioinformatics)

21 pages, 3406 KiB

Open Access Article

ResNet-SE-CBAM Siamese Networks for Few-Shot and Imbalanced PCB Defect Classification

by Chao-Hsiang Hsiao, Huan-Che Su, Yin-Tien Wang, Min-Jie Hsu and Chen-Chien Hsu

Sensors 2025, 25(13), 4233; https://doi.org/10.3390/s25134233 - 7 Jul 2025

Viewed by 568

Abstract

Defect detection in mass production lines often involves small and imbalanced datasets, necessitating the use of few-shot learning methods. Traditional deep learning-based approaches typically rely on large datasets, limiting their applicability in real-world scenarios. This study explores few-shot learning models for detecting product defects using limited data, enhancing model generalization and stability. Unlike previous deep learning models that require extensive datasets, our approach effectively performs defect detection with minimal data. We propose a Siamese network that integrates Residual blocks, Squeeze-and-Excitation blocks, and Convolutional Block Attention Modules (ResNet-SE-CBAM Siamese network) for feature extraction, optimized through triplet loss for embedding learning. The ResNet-SE-CBAM Siamese network incorporates two primary features: attention mechanisms and metric learning. The recently developed attention mechanisms enhance the convolutional neural network operations and significantly improve feature extraction performance. Meanwhile, metric learning allows for the addition or removal of feature classes without the need to retrain the model, improving its applicability in industrial production lines with limited defect samples. To further improve training efficiency with imbalanced datasets, we introduce a sample selection method based on the Structural Similarity Index Measure (SSIM). Additionally, a high defect rate training strategy is utilized to reduce the False Negative Rate (FNR) and ensure no missed defect detections. At the classification stage, a K-Nearest Neighbor (KNN) classifier is employed to mitigate overfitting risks and enhance stability in few-shot conditions. The experimental results demonstrate that with a good-to-defect ratio of 20:40, the proposed system achieves a classification accuracy of 94% and an FNR of 2%. Furthermore, when the number of defective samples increases to 80, the system achieves zero false negatives (FNR = 0%). The proposed metric learning approach outperforms traditional parametric deep learning models, such as the YOLO series, in defect detection, achieving higher accuracy and lower miss rates and highlighting its potential for high-reliability industrial deployment. Full article
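The triplet loss used for embedding learning above can be sketched for one (anchor, positive, negative) triple. This is a generic textbook form, given as an assumption rather than the authors' exact setup; the Euclidean distance, the margin value, and the function name `triplet_loss` are illustrative choices.

```python
import math


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss on Euclidean distances between embeddings.

    The loss is zero once the negative sample is farther from the anchor
    than the positive sample by at least `margin`.
    """
    d_pos = math.dist(anchor, positive)  # same-class distance
    d_neg = math.dist(anchor, negative)  # different-class distance
    return max(d_pos - d_neg + margin, 0.0)


# A well-separated triple incurs no loss; a swapped one is penalized.
good = triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0])
bad = triplet_loss([0.0, 0.0], [3.0, 4.0], [0.0, 1.0])
```

Because training only shapes distances in the embedding space, a nearest-neighbor classifier can accept new defect classes by adding reference embeddings, which matches the retraining-free property the abstract describes.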

(This article belongs to the Special Issue Artificial Intelligence and Sensor-Enhanced Fault Diagnosis for Industrial Application)

20 pages, 1935 KiB

Open Access Article

Residual Attention Network with Atrous Spatial Pyramid Pooling for Soil Element Estimation in LUCAS Hyperspectral Data

by Yun Deng, Yuchen Cao, Shouxue Chen and Xiaohui Cheng

Appl. Sci. 2025, 15(13), 7457; https://doi.org/10.3390/app15137457 - 3 Jul 2025

Viewed by 292

Abstract

Visible and near-infrared (Vis–NIR) spectroscopy enables the rapid prediction of soil properties but faces three limitations with conventional machine learning: information loss and overfitting from high-dimensional spectral features; inadequate modeling of nonlinear soil–spectra relationships; and failure to integrate multi-scale spatial features. To address these challenges, we propose ReSE-AP Net, a multi-scale attention residual network with spatial pyramid pooling. Built on convolutional residual blocks, the model incorporates a squeeze-and-excitation channel attention mechanism to recalibrate feature weights and an atrous spatial pyramid pooling (ASPP) module to extract multi-resolution spectral features. This architecture synergistically represents weak absorption peaks (400–1000 nm) and broad spectral bands (1000–2500 nm), overcoming single-scale modeling limitations. Validation on the LUCAS2009 dataset demonstrated that ReSE-AP Net outperformed conventional machine learning by improving the R² by 2.8–36.5% and reducing the RMSE by 14.2–69.2%. Compared with existing deep learning methods, it increased the R² by 0.4–25.5% for clay, silt, sand, organic carbon, calcium carbonate, and phosphorus predictions, and decreased the RMSE by 0.7–39.0%. Our contributions include statistical analysis of LUCAS2009 spectra, identification of conventional method limitations, development of the ReSE-AP Net model, ablation studies, and comprehensive comparisons with alternative approaches. Full article

(This article belongs to the Special Issue Advanced Agricultural Technologies: Monitoring, Modeling, and Machine Learning Techniques)

20 pages, 2132 KiB

Open Access Article

Deep Learning with Dual-Channel Feature Fusion for Epileptic EEG Signal Classification

by Bingbing Yu, Mingliang Zuo and Li Sui

Eng 2025, 6(7), 150; https://doi.org/10.3390/eng6070150 - 2 Jul 2025

Viewed by 378

Abstract

Background: Electroencephalography (EEG) signals play a crucial role in diagnosing epilepsy by reflecting distinct patterns associated with normal brain activity, ictal (seizure) states, and interictal (between-seizure) periods. However, the manual classification of these patterns is labor-intensive, time-consuming, and depends heavily on specialized expertise. While deep learning methods have shown promise, many current models suffer from limitations such as excessive complexity, high computational demands, and insufficient generalizability. Developing lightweight and accurate models for real-time epilepsy detection remains a key challenge. Methods: This study proposes a novel dual-channel deep learning model to classify epileptic EEG signals into three categories: normal, ictal, and interictal states. Channel 1 integrates a bidirectional long short-term memory (BiLSTM) network with a Squeeze-and-Excitation (SE) ResNet attention module to dynamically emphasize critical feature channels. Channel 2 employs a dual-branch convolutional neural network (CNN) to extract deeper and distinct features. The model’s performance was evaluated on the publicly available Bonn EEG dataset. Results: The proposed model achieved an outstanding accuracy of 98.57%. The dual-channel structure improved specificity to 99.43%, while the dual-branch CNN boosted sensitivity by 5.12%. Components such as SE-ResNet attention modules contributed 4.29% to the accuracy improvement, and BiLSTM further enhanced specificity by 1.62%. Ablation studies validated the significance of each module. Conclusions: By leveraging a lightweight design and attention-based mechanisms, the dual-channel model offers high diagnostic precision while maintaining computational efficiency. Its applicability to real-time automated diagnosis positions it as a promising tool for clinical deployment across diverse patient populations. Full article

(This article belongs to the Special Issue Advanced Artificial Intelligence Techniques for Disease Prediction, Diagnosis and Management)

20 pages, 760 KiB

Open AccessArticle

Detecting AI-Generated Images Using a Hybrid ResNet-SE Attention Model

by Abhilash Reddy Gunukula, Himel Das Gupta and Victor S. Sheng

Appl. Sci. 2025, 15(13), 7421; https://doi.org/10.3390/app15137421 - 2 Jul 2025

Viewed by 398

Abstract

The rapid advancements in generative artificial intelligence (AI), particularly through models like Generative Adversarial Networks (GANs) and diffusion-based architectures, have made it increasingly difficult to distinguish between real and synthetically generated images. While these technologies offer benefits in creative domains, they also pose serious risks in terms of misinformation, digital forgery, and identity manipulation. This paper presents a novel hybrid deep learning model for detecting AI-generated images that integrates Squeeze-and-Excitation (SE) attention blocks into the ResNet-50 backbone. The resulting SE-ResNet50 model enhances channel-wise feature recalibration and interpretability, enabling dynamic emphasis on subtle generative artifacts such as unnatural textures and semantic inconsistencies, thereby improving classification fidelity. Experimental evaluation on the CIFAKE dataset demonstrates the model’s effectiveness, achieving a test accuracy of 96.12%, precision of 97.04%, recall of 88.94%, F1-score of 92.82%, and an AUC score of 0.9862. The model shows strong generalization, minimal overfitting, and superior performance compared with transformer-based models and standard architectures like ResNet-50, VGGNet, and DenseNet. These results confirm the hybrid model’s suitability for real-time and resource-constrained applications in media forensics, content authentication, and ethical AI governance.
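
The channel-wise recalibration that an SE block performs can be illustrated with a minimal sketch. This is not the authors' SE-ResNet50 implementation: the function name `se_block`, the toy weights, and the reduction ratio are assumptions chosen for illustration, and the code uses plain Python lists rather than a deep learning framework. It shows the three generic steps of a squeeze-and-excitation block: squeeze (global average pooling per channel), excitation (a small bottleneck network ending in a sigmoid gate), and scale (reweighting each channel by its gate).

```python
import math

def se_block(x, w1, w2):
    """Apply one squeeze-and-excitation step to a feature map.

    x  : list of C channels, each an H x W list of lists of floats
    w1 : (C // r) x C weights of the reduction layer (hypothetical values)
    w2 : C x (C // r) weights of the expansion layer (hypothetical values)
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    z = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in x]
    # Excitation: bottleneck layer with ReLU, then a sigmoid gate per channel.
    hidden = [max(0.0, sum(w * v for w, v in zip(row, z))) for row in w1]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale: reweight every value in a channel by that channel's gate.
    scaled = [[[v * g for v in row] for row in ch] for ch, g in zip(x, gates)]
    return scaled, gates

# Toy input: 2 channels of size 2 x 2, reduction ratio r = 2 (assumed).
x = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
w1 = [[1.0, -1.0]]   # 1 x 2 reduction weights
w2 = [[0.0], [0.0]]  # 2 x 1 expansion weights; zeros give gates of exactly 0.5
scaled, gates = se_block(x, w1, w2)
```

In a trained network the weights would be learned, so the gates would differ per channel and suppress or amplify channels according to their global statistics.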

(This article belongs to the Special Issue Advanced Signal and Image Processing for Applied Engineering)

24 pages, 1307 KiB

Open AccessArticle

A Self-Supervised Specific Emitter Identification Method Based on Contrastive Asymmetric Masked Learning

by Dong Wang, Yonghui Huang, Tianshu Cui and Yan Zhu

Sensors 2025, 25(13), 4023; https://doi.org/10.3390/s25134023 - 27 Jun 2025

Viewed by 298

Abstract

Specific emitter identification (SEI) is a core technology for wireless device security that plays a crucial role in protecting wireless communication systems from various security threats. However, current deep learning-based SEI methods heavily rely on large amounts of labeled data for supervised training, facing challenges in non-cooperative communication scenarios. To address these issues, this paper proposes a novel contrastive asymmetric masked learning-based SEI (CAML-SEI) method, effectively solving the problem of SEI under scarce labeled samples. The proposed method constructs an asymmetric auto-encoder architecture, comprising an encoder network based on channel squeeze-and-excitation residual blocks to capture radio frequency fingerprint (RFF) features embedded in signals, while employing a lightweight single-layer convolutional decoder for masked signal reconstruction. This design promotes the learning of fine-grained local feature representations. To further enhance feature discriminability, a learnable non-linear mapping is introduced to compress high-dimensional encoded features into a compact low-dimensional space, accompanied by a contrastive loss function that simultaneously achieves feature aggregation of positive samples and feature separation of negative samples. Finally, the network is jointly optimized by combining signal reconstruction and feature contrast tasks. Experiments conducted on real-world ADS-B and Wi-Fi datasets demonstrate that the proposed method effectively learns generalized RFF features, and the results show superior performance compared with other SEI methods.
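
The abstract does not state which contrastive loss CAML-SEI uses. As a hedged illustration of how a contrastive objective can pull positive pairs together while pushing negatives apart, the sketch below implements an InfoNCE-style loss on cosine similarities. InfoNCE is a swapped-in, commonly used choice, not necessarily the paper's loss; the function names, the toy vectors, and the temperature of 0.1 are all assumptions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    return dot / (norm_a * norm_b)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss for one anchor (assumed form, not the paper's)."""
    # Similarity to the positive sample, sharpened by the temperature.
    pos = math.exp(cosine(anchor, positive) / temperature)
    # Summed similarity to all negative samples.
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    # Loss is small when the positive dominates the negatives.
    return -math.log(pos / (pos + neg))

# Toy 2-D features: the anchor matches its positive in the first case only.
loss_aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
loss_misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

With these toy vectors the aligned case yields a loss near zero and the misaligned case a loss near 10, which is the behavior a contrastive term is meant to reward during joint training with the reconstruction task.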

(This article belongs to the Section Communications)
