MDPI - Publisher of Open Access Journals

23 pages, 6440 KiB

Open Access Article

A Gravity Data Denoising Method Based on Multi-Scale Attention Mechanism and Physical Constraints Using U-Net

by Bing Liu, Houpu Li, Shaofeng Bian, Chaoliang Zhang, Bing Ji and Yujie Zhang

Appl. Sci. 2025, 15(14), 7956; https://doi.org/10.3390/app15147956 (registering DOI) - 17 Jul 2025

Gravity and gravity gradient data serve as fundamental inputs for geophysical resource exploration and geological structure analysis. However, traditional denoising methods, including wavelet transforms, moving averages, and low-pass filtering, exhibit signal loss and limited adaptability under complex, non-stationary noise conditions. To address these challenges, this study proposes an improved U-Net deep learning framework that integrates multi-scale feature extraction and attention mechanisms. Furthermore, a Laplace consistency constraint is introduced into the loss function to enhance denoising performance and physical interpretability. Notably, the datasets used in this study are generated by the authors, involving simulations of subsurface prism distributions with realistic density perturbations (±20% of typical rock densities) and the addition of controlled Gaussian noise (5%, 10%, 15%, and 30%) to simulate field-like conditions, ensuring the diversity and physical relevance of training samples. Experimental validation on these synthetic datasets and real field datasets demonstrates the superiority of the proposed method over conventional techniques. For noise levels of 5%, 10%, 15%, and 30% in test sets, the improved U-Net achieves Peak Signal-to-Noise Ratios (PSNR) of 59.13 dB, 52.03 dB, 48.62 dB, and 48.81 dB, respectively, outperforming wavelet transforms, moving averages, and low-pass filtering by 10–30 dB. In multi-component gravity gradient denoising, our method excels in detail preservation and noise suppression, improving Structural Similarity Index (SSIM) by 15–25%. Field data tests further confirm enhanced identification of key geological anomalies and overall data quality improvement. In summary, the improved U-Net not only delivers quantitative advancements in gravity data denoising but also provides a novel approach for high-precision geophysical data preprocessing.

(This article belongs to the Special Issue Applications of Machine Learning in Earth Sciences—2nd Edition)
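
The abstract above reports results as PSNR and mentions a Laplace consistency constraint in the loss. As a rough illustration only, the sketch below computes PSNR between two grids and one plausible form of a Laplace penalty: in source-free space the gravitational potential satisfies Laplace's equation, so the trace of the gravity gradient tensor (Txx + Tyy + Tzz) should vanish. The function names, the choice of peak value, and the exact form of the penalty are assumptions made here, not details taken from the paper.

```python
import math


def psnr(clean, denoised, peak=None):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D grids."""
    flat_clean = [v for row in clean for v in row]
    flat_den = [v for row in denoised for v in row]
    mse = sum((a - b) ** 2 for a, b in zip(flat_clean, flat_den)) / len(flat_clean)
    if mse == 0:
        return math.inf
    if peak is None:
        # Assumption: use the dynamic range of the clean grid as the peak value.
        peak = max(flat_clean) - min(flat_clean)
    return 10.0 * math.log10(peak ** 2 / mse)


def laplace_penalty(txx, tyy, tzz):
    """Mean squared trace of the gradient tensor over a grid.

    Hypothetical penalty term: it is zero when Txx + Tyy + Tzz = 0 holds
    at every grid point, and grows as the denoised components drift away
    from Laplace's equation.
    """
    total, count = 0.0, 0
    for row_x, row_y, row_z in zip(txx, tyy, tzz):
        for a, b, c in zip(row_x, row_y, row_z):
            total += (a + b + c) ** 2
            count += 1
    return total / count
```

In a training loop, a term like `laplace_penalty` would typically be added to a data-fidelity loss with a small weight; how the authors weight or discretize their constraint is not stated in the abstract.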

14 pages, 890 KiB

Open Access Article

Radiomics Signature of Aging Myocardium in Cardiac Photon-Counting Computed Tomography

by Alexander Hertel, Mustafa Kuru, Johann S. Rink, Florian Haag, Abhinay Vellala, Theano Papavassiliu, Matthias F. Froelich, Stefan O. Schoenberg and Isabelle Ayx

Diagnostics 2025, 15(14), 1796; https://doi.org/10.3390/diagnostics15141796 (registering DOI) - 16 Jul 2025

Abstract

Background: Cardiovascular diseases are the leading cause of global mortality, with 80% of coronary heart disease in patients over 65. Understanding aging cardiovascular structures is crucial. Photon-counting computed tomography (PCCT) offers improved spatial and temporal resolution and better signal-to-noise ratio, enabling texture analysis in clinical routines. Detecting structural changes in aging left-ventricular myocardium may help predict cardiovascular risk. Methods: In this retrospective, single-center, IRB-approved study, 90 patients underwent ECG-gated contrast-enhanced cardiac CT using dual-source PCCT (NAEOTOM Alpha, Siemens). Patients were divided into two age groups (50–60 years and 70–80 years). The left ventricular myocardium was segmented semi-automatically, and radiomics features were extracted using pyradiomics to compare myocardial texture features. Epicardial adipose tissue (EAT) density, thickness, and other clinical parameters were recorded. Statistical analysis was conducted with R and a Python-based random forest classifier. Results: The study assessed 90 patients (50–60 years, n = 54, and 70–80 years, n = 36) with a mean age of 63.6 years. No significant differences were found in mean Agatston score, gender distribution, or conditions like hypertension, diabetes, hypercholesterolemia, or nicotine abuse. EAT measurements showed no significant differences. The Random Forest Classifier achieved a training accuracy of 0.95 and a test accuracy of 0.74 for age group differentiation. Wavelet-HLH_glszm_GrayLevelNonUniformity was a key differentiator. Conclusions: Radiomics texture features of the left ventricular myocardium outperformed conventional parameters like EAT density and thickness in differentiating age groups, offering a potential imaging biomarker for myocardial aging. 
Radiomics analysis of left ventricular myocardium offers a unique opportunity to visualize changes in myocardial texture during aging and could serve as a cardiac risk predictor.

(This article belongs to the Section Machine Learning and Artificial Intelligence in Diagnostics)

14 pages, 1563 KiB

Open Access Article

High-Resolution Time-Frequency Feature Selection and EEG Augmented Deep Learning for Motor Imagery Recognition

by Mouna Bouchane, Wei Guo and Shuojin Yang

Electronics 2025, 14(14), 2827; https://doi.org/10.3390/electronics14142827 - 14 Jul 2025

Abstract

Motor Imagery (MI)-based Brain-Computer Interfaces (BCIs) have promising applications in neurorehabilitation for individuals who have lost mobility and control over parts of their body due to brain injuries, such as stroke patients. Accurately classifying MI tasks is essential for effective BCI performance, but this task remains challenging due to the complex and non-stationary nature of EEG signals. This study aims to improve the classification of left- and right-hand MI tasks by utilizing high-resolution time-frequency features extracted from EEG signals, enhanced with deep learning-based data augmentation techniques. We propose a novel deep learning framework named the Generalized Wavelet Transform-based Deep Convolutional Network (GDC-Net), which integrates multiple components. First, EEG signals recorded from the C3, C4, and Cz channels are transformed into detailed time-frequency representations using the Generalized Morse Wavelet Transform (GMWT). The selected features are then expanded using a Deep Convolutional Generative Adversarial Network (DCGAN) to generate additional synthetic data and address data scarcity. Finally, the augmented feature maps are fed into a hybrid CNN-LSTM architecture, enabling both spatial and temporal feature learning for improved classification. The proposed approach is evaluated on the BCI Competition IV dataset 2b. Experimental results show a mean classification accuracy of 89.24% and a Kappa value of 0.784, the highest among the compared state-of-the-art algorithms. The integration of GMWT and DCGAN significantly enhances feature quality and model generalization, thereby improving classification performance. These findings demonstrate that GDC-Net delivers superior MI classification performance by effectively capturing high-resolution time-frequency dynamics and enhancing data diversity.
This approach holds strong potential for advancing MI-based BCI applications, especially in assistive and rehabilitation technologies.

(This article belongs to the Section Computer Science & Engineering)
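
The accuracy and Kappa value reported above are linked through Cohen's kappa, which discounts the agreement expected by chance. The sketch below computes kappa from a confusion matrix. The counts are hypothetical, chosen only so that a balanced two-class problem has 89.24% accuracy; they are not data from the paper.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    n = sum(sum(row) for row in confusion)
    k = len(confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (observed - expected) / (1.0 - expected)


# Hypothetical balanced left/right-hand counts with 89.24% accuracy.
confusion = [[4462, 538], [538, 4462]]
kappa = cohens_kappa(confusion)
```

With balanced classes the chance agreement is 0.5, so kappa is (0.8924 - 0.5) / 0.5 = 0.7848, which is in line with the reported 0.784. A value averaged over subjects need not match this single-matrix calculation exactly.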

27 pages, 8538 KiB

Open Access Article

Optimizing Hyperspectral Desertification Monitoring Through Metaheuristic-Enhanced Wavelet Packet Noise Reduction and Feature Band Selection

by Weichao Liu, Jiapeng Xiao, Rongyuan Liu, Yan Liu, Yunzhu Tao, Tian Zhang, Fuping Gan, Ping Zhou, Yuanbiao Dong and Qiang Zhou

Remote Sens. 2025, 17(14), 2444; https://doi.org/10.3390/rs17142444 - 14 Jul 2025

Abstract

Land desertification represents a significant and sensitive global ecological issue. In the Inner Mongolia region of China, soil desertification and salinization are widespread, resulting from the combined effects of extreme drought conditions and human activities. Using Gaofen 5B AHSI imagery as our data source, we collected spectral data for seven distinct land cover types: lush vegetation, yellow sand, white sand, saline soil, saline shell, saline soil with saline vegetation, and sandy soil. We applied Particle Swarm Optimization (PSO) to fine-tune the Wavelet Packet (WP) decomposition levels, thresholds, and wavelet basis function, ensuring optimal spectral decomposition and reconstruction. Subsequently, PSO was deployed to optimize key hyperparameters of the Random Forest algorithm, and its performance was compared with that of the ResNet-Transformer model. Our results indicate that PSO effectively automates the search for optimal WP decomposition parameters, preserving essential spectral information while efficiently reducing high-frequency spectral noise. The Genetic Algorithm (GA) was also found to be effective in extracting feature bands relevant to land desertification, which enhances the classification accuracy of the model. Among all the models, the WP-GA-FD-ResNet-Transformer model, which integrates wavelet packet denoising, genetic algorithm feature selection, the first-order differential (FD), and the hybrid ResNet-Transformer architecture, achieved the highest accuracy in extracting soil sandification and salinization, with a Kappa coefficient of 0.9746 and a validation set accuracy of 97.82%. This study contributes to the field by advancing hyperspectral desertification monitoring techniques and suggests that the approach could be valuable for broader ecological conservation and land management efforts.

(This article belongs to the Section Ecological Remote Sensing)

26 pages, 7701 KiB

Open Access Article

YOLO-StarLS: A Ship Detection Algorithm Based on Wavelet Transform and Multi-Scale Feature Extraction for Complex Environments

by Yihan Wang, Shuang Zhang, Jianhao Xu, Zhenwen Cheng and Gang Du

Symmetry 2025, 17(7), 1116; https://doi.org/10.3390/sym17071116 - 11 Jul 2025

Abstract

Ship detection in complex environments presents challenges such as sea surface reflections, wave interference, variations in illumination, and a range of target scales. The interaction between symmetric ship structures and wave patterns challenges conventional algorithms, particularly in maritime wireless networks. This study presents YOLO-StarLS (You Only Look Once with Star-topology Lightweight Ship detection), a detection framework leveraging wavelet transforms and multi-scale feature extraction through three core modules. We developed a Wavelet Multi-scale Feature Extraction Network (WMFEN) utilizing adaptive Haar wavelet decomposition with star-topology extraction to preserve multi-frequency information while minimizing detail loss. We introduced a Cross-axis Spatial Attention Refinement module (CSAR), which integrates star structures with cross-axis attention mechanisms to enhance spatial perception. We constructed an Efficient Detail-Preserving Detection head (EDPD) combining differential and shared convolutions to enhance edge detection while reducing computational complexity. Evaluation on the SeaShips dataset demonstrated that YOLO-StarLS achieved superior performance on both the mAP50 and mAP50–95 metrics, improving by 2.21% and 2.42%, respectively, over the baseline YOLO11. The approach was also efficient, with a 36% reduction in the number of parameters to 1.67 M, a 34% decrease in complexity to 4.3 GFLOPs, and an inference speed of 162.0 FPS. Comparative analysis against eight algorithms confirmed its superiority in symmetric target detection. This work enhances real-time ship detection and provides foundations for maritime wireless surveillance networks.

(This article belongs to the Section Computer)
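
The Haar wavelet decomposition named in the abstract above splits an image into one approximation subband and three detail subbands at half resolution. The sketch below shows a single level of the standard, fixed (non-adaptive) orthonormal 2-D Haar transform and its inverse in plain Python. It illustrates only the basic building block; the adaptive variant and the star-topology extraction in WMFEN are not reproduced, and the function and subband names are chosen here for illustration.

```python
def haar_2d(image):
    """One level of an orthonormal 2-D Haar transform.

    `image` is a list of rows with even height and width. Returns four
    half-size grids: approximation, column-difference detail,
    row-difference detail, and diagonal detail.
    """
    approx, d_cols, d_rows, d_diag = [], [], [], []
    for r in range(0, len(image), 2):
        a_row, c_row, r_row, g_row = [], [], [], []
        for c in range(0, len(image[0]), 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            a_row.append((a + b + d + e) / 2.0)  # local average (low-pass)
            c_row.append((a - b + d - e) / 2.0)  # change across columns
            r_row.append((a + b - d - e) / 2.0)  # change across rows
            g_row.append((a - b - d + e) / 2.0)  # diagonal change
        approx.append(a_row)
        d_cols.append(c_row)
        d_rows.append(r_row)
        d_diag.append(g_row)
    return approx, d_cols, d_rows, d_diag


def inverse_haar_2d(approx, d_cols, d_rows, d_diag):
    """Invert haar_2d, rebuilding the full-resolution grid exactly."""
    image = []
    for i in range(len(approx)):
        top, bottom = [], []
        for j in range(len(approx[0])):
            s, c, r, g = approx[i][j], d_cols[i][j], d_rows[i][j], d_diag[i][j]
            top.extend([(s + c + r + g) / 2.0, (s - c + r - g) / 2.0])
            bottom.extend([(s + c - r - g) / 2.0, (s - c - r + g) / 2.0])
        image.append(top)
        image.append(bottom)
    return image
```

Because the transform is invertible, no information is lost by the decomposition itself; any detail loss in a network comes from how the subbands are processed afterwards.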

21 pages, 7528 KiB

Open Access Article

A Fine-Tuning Method via Adaptive Symmetric Fusion and Multi-Graph Aggregation for Human Pose Estimation

by Yinliang Shi, Zhaonian Liu, Bin Jiang, Tianqi Dai and Yuanfeng Lian

Symmetry 2025, 17(7), 1098; https://doi.org/10.3390/sym17071098 - 9 Jul 2025

Abstract

Human Pose Estimation (HPE) aims to accurately locate the positions of human key points in images or videos. However, the performance of HPE is often significantly reduced in practical application scenarios due to environmental interference. To address this challenge, we propose a ladder side-tuning method for the Vision Transformer (ViT) pre-trained model based on multi-path feature fusion to improve the accuracy of HPE in highly interfering environments. First, we extract the global features, frequency features, and multi-scale spatial features through the ViT pre-trained model, the discrete wavelet convolutional network, and the atrous spatial pyramid pooling network (ASPP), respectively. By comprehensively capturing the information of the human body and the environment, the ability of the model to analyze local details, textures, and spatial information is enhanced. To efficiently fuse these features, we devise an adaptive symmetric feature fusion strategy, which dynamically adjusts the intensity of feature fusion according to the similarity among features to achieve the optimal fusion effect. Finally, a multi-graph feature aggregation method is developed. We construct graph structures of different features and deeply explore the subtle differences among the features based on the dual fusion mechanism of points and edges to ensure the information integrity. The experimental results demonstrate that our method achieves 4.3% and 4.2% improvements in the AP metric on the MS COCO dataset and a custom high-interference dataset, respectively, compared with HRNet. This highlights its superiority for human pose estimation tasks in both general and interfering environments.

(This article belongs to the Special Issue Symmetry and Asymmetry in Computer Vision and Graphics)

17 pages, 7786 KiB

Open Access Article

Video Coding Based on Ladder Subband Recovery and ResGroup Module

by Libo Wei, Aolin Zhang, Lei Liu, Jun Wang and Shuai Wang

Entropy 2025, 27(7), 734; https://doi.org/10.3390/e27070734 - 8 Jul 2025

Abstract

With the rapid development of video encoding technology in the field of computer vision, the demand for tasks such as video frame reconstruction, denoising, and super-resolution has been continuously increasing. However, traditional video encoding methods typically focus on extracting spatial or temporal domain information, often facing challenges of insufficient accuracy and information loss when reconstructing high-frequency details, edges, and textures of images. To address this issue, this paper proposes an innovative LadderConv framework, which combines discrete wavelet transform (DWT) with spatial and channel attention mechanisms. By progressively recovering wavelet subbands, it effectively enhances the video frame encoding quality. Specifically, the LadderConv framework adopts a stepwise recovery approach for wavelet subbands, first processing high-frequency detail subbands with relatively less information, then enhancing the interaction between these subbands, and ultimately synthesizing a high-quality reconstructed image through inverse wavelet transform. Moreover, the framework introduces spatial and channel attention mechanisms, which further strengthen the focus on key regions and channel features, leading to notable improvements in detail restoration and image reconstruction accuracy. To optimize the performance of the LadderConv framework, particularly in detail recovery and high-frequency information extraction tasks, this paper designs an innovative ResGroup module. By using multi-layer convolution operations along with feature map compression and recovery, the ResGroup module enhances the network’s expressive capability and effectively reduces computational complexity. The ResGroup module captures multi-level features from low level to high level and retains rich feature information through residual connections, thus improving the overall reconstruction performance of the model. 
In experiments, the combination of the LadderConv framework and the ResGroup module demonstrates superior performance in video frame reconstruction tasks, particularly in recovering high-frequency information, image clarity, and detail representation.

(This article belongs to the Special Issue Rethinking Representation Learning in the Age of Large Models)

23 pages, 6001 KiB

Open Access Article

Quantification of Flavonoid Contents in Holy Basil Using Hyperspectral Imaging and Deep Learning Approaches

by Apichat Suratanee, Panita Chutimanukul and Kitiporn Plaimas

Appl. Sci. 2025, 15(13), 7582; https://doi.org/10.3390/app15137582 - 6 Jul 2025

Abstract

Holy basil (Ocimum tenuiflorum L.) is a medicinal herb rich in bioactive flavonoids with therapeutic properties. Traditional quantification methods rely on time-consuming and destructive extraction processes, whereas hyperspectral imaging provides a rapid, non-destructive alternative by analysing spectral signatures. However, effectively linking hyperspectral data to flavonoid levels remains a challenge for developing early detection tools before harvest. This study integrates deep learning with hyperspectral imaging to quantify flavonoid contents in 113 samples from 26 Thai holy basil cultivars collected across diverse regions of Thailand. Two deep learning architectures, ResNet1D and CNN1D, were evaluated in combination with feature extraction techniques, including wavelet transformation and Gabor-like filtering. ResNet1D with wavelet transformation achieved optimal performance, yielding an area under the receiver operating characteristic curve (AUC) of 0.8246 and an accuracy of 0.7702 for flavonoid content classification. Cross-validation demonstrated the model’s robust predictive capability in identifying antioxidant-rich samples. Samples with the highest predicted flavonoid content were identified, and cultivars exhibiting elevated levels of both flavonoids and phenolics were highlighted across various regions of Thailand. These findings demonstrate the predictive capability of hyperspectral data combined with deep learning for phytochemical assessment. This approach offers a valuable tool for non-destructive quality evaluation and supports cultivar selection for higher phytochemical content in breeding programs and agricultural applications.

(This article belongs to the Special Issue Emerging Analytical Techniques in Food Industry and Agricultural Products)
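
The AUC quoted in the abstract above can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that pairwise definition. The labels and scores in the usage note are made up for illustration and have no connection to the holy basil data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    `labels` holds 0 (negative) or 1 (positive); `scores` holds the
    classifier's predicted score for the positive class. Ties between a
    positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))
```

For example, `roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` gives 0.75, because three of the four positive-negative pairs are ranked correctly. The pairwise loop is quadratic in sample count, which is fine for a dataset of the size mentioned above but not for large ones.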

27 pages, 19258 KiB

Open Access Article

A Lightweight Multi-Frequency Feature Fusion Network with Efficient Attention for Breast Tumor Classification in Pathology Images

by Hailong Chen, Qingqing Song and Guantong Chen

Information 2025, 16(7), 579; https://doi.org/10.3390/info16070579 - 6 Jul 2025

Abstract

The intricate and complex tumor cell morphology in breast pathology images is a key factor for tumor classification. This paper proposes a lightweight breast tumor classification model with multi-frequency feature fusion (LMFM) to tackle the problem of inadequate feature extraction and poor classification performance. The LMFM utilizes wavelet transform (WT) for multi-frequency feature fusion, integrating high-frequency (HF) tumor details with high-level semantic features to enhance feature representation. The network’s ability to extract irregular tumor characteristics is further reinforced by dynamic adaptive deformable convolution (DADC). The introduction of the token-based Region Focus Module (TRFM) reduces interference from irrelevant background information. At the same time, the incorporation of a linear attention (LA) mechanism lowers the model’s computational complexity and further enhances its global feature extraction capability. The experimental results demonstrate that the proposed model achieves classification accuracies of 98.23% and 97.81% on the BreaKHis and BACH datasets, with only 9.66 M parameters.

(This article belongs to the Section Biomedical Information and Health)

18 pages, 1709 KiB

Open Access Article

Toward New Assessment in Sarcoma Identification and Grading Using Artificial Intelligence Techniques

by Arnar Evgení Gunnarsson, Simona Correra, Carol Teixidó Sánchez, Marco Recenti, Halldór Jónsson and Paolo Gargiulo

Diagnostics 2025, 15(13), 1694; https://doi.org/10.3390/diagnostics15131694 - 2 Jul 2025

Abstract

Background/Objectives: Sarcomas are a rare and heterogeneous group of malignant tumors, which makes early detection and grading particularly challenging. Diagnosis traditionally relies on expert visual interpretation of histopathological biopsies and radiological imaging, processes that can be time-consuming, subjective and susceptible to inter-observer variability. Methods: In this study, we aim to explore the potential of artificial intelligence (AI), specifically radiomics and machine learning (ML), to support sarcoma diagnosis and grading based on MRI scans. We extracted quantitative features from both raw and wavelet-transformed images, including first-order statistics and texture descriptors such as the gray-level co-occurrence matrix (GLCM), gray-level size-zone matrix (GLSZM), gray-level run-length matrix (GLRLM), and neighboring gray tone difference matrix (NGTDM). These features were used to train ML models for two tasks: binary classification of healthy vs. pathological tissue and prognostic grading of sarcomas based on the French FNCLCC system. Results: The binary classification achieved an accuracy of 76.02% using a combination of features from both raw and transformed images. FNCLCC grade classification reached an accuracy of 57.6% under the same conditions. Specifically, wavelet transforms of raw images boosted classification accuracy, hinting at the large potential that image transforms can add to these tasks. Conclusions: Our findings highlight the value of combining multiple radiomic features and demonstrate that wavelet transforms significantly enhance classification performance. By outlining the potential of AI-based approaches in sarcoma diagnostics, this work seeks to promote the development of decision support systems that could assist clinicians.

(This article belongs to the Special Issue Artificial Intelligence in Clinical Decision Support—2nd Edition)
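
Of the texture descriptors listed in the abstract above, the gray-level co-occurrence matrix (GLCM) is the simplest to show by hand: it counts how often a pixel with gray level i has a neighbor with gray level j at a fixed offset. The sketch below builds a GLCM for a small quantized image and derives the contrast feature from it. The offset, the symmetric counting, and the function names are illustrative choices made here, not settings from the study, which does not state its extraction parameters in the abstract.

```python
def glcm(image, levels, dx=1, dy=0, symmetric=True):
    """Gray-level co-occurrence matrix for one pixel offset.

    `image` is a list of rows of integer gray levels in [0, levels).
    (dx, dy) is the offset to the neighbor; the default pairs each pixel
    with the one immediately to its right.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                matrix[i][j] += 1
                if symmetric:
                    # Count the pair in both directions.
                    matrix[j][i] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast feature: mean squared gray-level difference of the counted pairs."""
    total = sum(sum(row) for row in matrix)
    weighted = sum(
        (i - j) ** 2 * count
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )
    return weighted / total
```

For the 2 x 3 image `[[0, 0, 1], [1, 2, 2]]` with three gray levels, the horizontal pairs are (0,0), (0,1), (1,2), and (2,2), so the symmetric GLCM holds eight counts and the contrast is 0.5. Wavelet-based radiomics features apply the same kind of descriptor to each wavelet subband instead of the raw image.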

24 pages, 24510 KiB

Open Access Article

Application of Graph-Theoretic Methods Using ERP Components and Wavelet Coherence on Emotional and Cognitive EEG Data

by Sencer Melih Deniz, Ahmet Ademoglu, Adil Deniz Duru and Tamer Demiralp

Brain Sci. 2025, 15(7), 714; https://doi.org/10.3390/brainsci15070714 - 2 Jul 2025

Abstract

Background/Objectives: Emotion and cognition, two essential components of human mental processes, have traditionally been studied independently. The exploration of emotion and cognition is fundamental for gaining an understanding of human mental functioning. Despite the availability of various methods to measure and evaluate emotional states and cognitive processes, physiological measurements are considered to be one of the most reliable methods due to their objective approach. In particular, electroencephalography (EEG) provides unique insight into emotional and cognitive activity through the analysis of event-related potentials (ERPs). In this study, we discriminated pleasant/unpleasant emotional moods and low/high cognitive states using graph-theoretic features extracted from spatio-temporal components. Methods: Emotional data were collected at the Physiology Department of Istanbul Medical Faculty at Istanbul University, whereas cognitive data were obtained from the DepositOnce repository of Technische Universität Berlin. Wavelet coherence values for the N100, N200, and P300 single-trial ERP components in the delta, theta, alpha, and beta frequency bands were investigated individually. Then, graph-theoretic analyses were performed using wavelet coherence-based connectivity maps. Global and local graph metrics such as energy efficiency, strength, transitivity, characteristic path length, and clustering coefficient were used as features for classification using support vector machines (SVMs), k-nearest neighbors (k-NN), and linear discriminant analysis (LDA). Results: The results show that both pleasant/unpleasant emotional moods and low/high cognitive states can be discriminated, with average accuracies of up to 92% and 89%, respectively. Conclusions: Graph-theoretic metrics based on wavelet coherence of ERP components in the delta band with the SVM algorithm allow for the discrimination of emotional and cognitive states with high accuracy.

(This article belongs to the Section Cognitive, Social and Affective Neuroscience)
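
Two of the graph metrics named in the abstract above, node strength and clustering coefficient, can be computed directly from a symmetric connectivity matrix such as one filled with pairwise wavelet coherence values. The sketch below is a minimal version under stated assumptions: the 4 x 4 matrix is invented for illustration, and binarizing the graph with a fixed threshold before computing clustering is one common convention, not necessarily the one used in the study.

```python
def node_strength(weights):
    """Sum of connection weights attached to each node (diagonal ignored)."""
    n = len(weights)
    return [sum(weights[i][j] for j in range(n) if j != i) for i in range(n)]


def clustering_coefficients(weights, threshold):
    """Binary clustering coefficient per node after thresholding the weights.

    An edge exists where the weight is at least `threshold`. For a node
    with k neighbors, the coefficient is the fraction of the k*(k-1)/2
    possible neighbor pairs that are themselves connected.
    """
    n = len(weights)
    adjacent = [
        [1 if i != j and weights[i][j] >= threshold else 0 for j in range(n)]
        for i in range(n)
    ]
    coefficients = []
    for i in range(n):
        neighbors = [j for j in range(n) if adjacent[i][j]]
        k = len(neighbors)
        if k < 2:
            coefficients.append(0.0)
            continue
        links = sum(
            adjacent[a][b]
            for index, a in enumerate(neighbors)
            for b in neighbors[index + 1:]
        )
        coefficients.append(2.0 * links / (k * (k - 1)))
    return coefficients


# Hypothetical coherence values between four electrodes (symmetric matrix).
coherence = [
    [1.0, 0.8, 0.7, 0.2],
    [0.8, 1.0, 0.9, 0.1],
    [0.7, 0.9, 1.0, 0.6],
    [0.2, 0.1, 0.6, 1.0],
]
strengths = node_strength(coherence)
clustering = clustering_coefficients(coherence, threshold=0.5)
```

Per-node values like these, or their averages over the network, would then form the feature vector passed to a classifier such as an SVM.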

20 pages, 7167 KiB

Open Access Article

FM-Net: Frequency-Aware Masked-Attention Network for Infrared Small Target Detection

by Yongxian Liu, Zaiping Lin, Boyang Li, Ting Liu and Wei An

Remote Sens. 2025, 17(13), 2264; https://doi.org/10.3390/rs17132264 - 1 Jul 2025

Abstract

Infrared small target detection (IRSTD) aims to locate and separate targets from complex backgrounds. The challenges in IRSTD primarily come from extremely sparse target features and strong background clutter interference. However, existing methods typically perform discrimination directly on the features extracted by deep networks, neglecting the distinct characteristics of weak and small targets in the frequency domain, thereby limiting the improvement of detection capability. In this paper, we propose a frequency-aware masked-attention network (FM-Net) that leverages multi-scale frequency clues to assist in representing global context and suppressing noise interference. Specifically, we design the wavelet residual block (WRB) to extract multi-scale spatial and frequency features, which introduces a wavelet pyramid as the intermediate layer of the residual block. Then, to perceive global information on the long-range skip connections, a frequency-modulation masked-attention module (FMM) is used to interact with multi-layer features from the encoder. FMM contains two crucial elements: (a) a mask attention (MA) mechanism for efficiently injecting broad contextual features to promote full-level semantic correlation and focus on salient regions, and (b) a channel-wise frequency modulation module (CFM) for enhancing the most informative frequency components and suppressing useless ones. Extensive experiments on three benchmark datasets (SIRST, NUDT-SIRST, and IRSTD-1k) demonstrate that FM-Net achieves superior detection performance.

(This article belongs to the Special Issue Advanced Artificial Intelligence and Deep Learning for Remote Sensing (3rd Edition))

31 pages, 8947 KiB

Open Access Article

Research on Super-Resolution Reconstruction of Coarse Aggregate Particle Images for Earth–Rock Dam Construction Based on Real-ESRGAN

by Shuangping Li, Lin Gao, Bin Zhang, Zuqiang Liu, Xin Zhang, Linjie Guan and Junxing Zheng

Sensors 2025, 25(13), 4084; https://doi.org/10.3390/s25134084 - 30 Jun 2025

Abstract

This paper investigates super-resolution reconstruction of coarse granular particle images for embankment construction in earth/rock dams based on Real-ESRGAN, aiming to improve the quality of low-resolution particle images and enhance the accuracy of particle shape analysis. The paper begins with a review of traditional image super-resolution methods and introduces Generative Adversarial Networks (GAN) and Real-ESRGAN, which enhance image detail recovery through perceptual loss and adversarial training. To improve the generalization ability of the super-resolution model, the study expands the morphological database of earth/rock dam particles with a multi-modal data augmentation strategy that covers a variety of particle shapes. A dual-stage degradation model simulates the image degradation process in real-world environments, providing a diverse set of degraded images for training the super-resolution reconstruction model. Using wavelet transform methods, the paper analyzes the edge and texture features of particle images, further improving the precision of particle shape feature extraction. Experimental results show that Real-ESRGAN outperforms traditional super-resolution algorithms in edge clarity, detail recovery, and preservation of the morphological features of particle images, particularly under low-resolution conditions. In conclusion, Real-ESRGAN demonstrates excellent performance in the super-resolution reconstruction of coarse granular particle images for embankment construction in earth/rock dams. It can effectively restore the details and morphological features of particle images, providing more accurate technical support for particle shape analysis in civil engineering.
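
A dual-stage degradation model of the kind described above can be sketched as one degradation step applied twice. The abstract does not list the operators or parameters used, so the mean-filter blur, stride-based downsampling, Gaussian noise, and every numeric value below are placeholder assumptions; the function names are hypothetical.

```python
import numpy as np

def box_blur(img, k):
    """Blur a 2D image with a k x k mean filter (k odd), using edge padding."""
    img = np.asarray(img, dtype=float)
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    # Sum the k*k shifted copies of the image, then average.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def degrade_once(img, k, scale, noise_std, rng):
    """One degradation stage: blur, downsample by striding, add Gaussian noise."""
    blurred = box_blur(img, k)
    small = blurred[::scale, ::scale]
    noisy = small + rng.normal(0.0, noise_std, small.shape)
    return np.clip(noisy, 0.0, 1.0)

def two_stage_degradation(img, rng):
    """Apply the degradation stage twice with (assumed) different noise levels."""
    stage1 = degrade_once(img, k=3, scale=2, noise_std=0.01, rng=rng)
    return degrade_once(stage1, k=3, scale=2, noise_std=0.02, rng=rng)
```

For a grayscale image with values in [0, 1], `two_stage_degradation(img, np.random.default_rng(0))` returns a low-resolution counterpart at one quarter of the original side length, which can be paired with the original image as a training sample.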

(This article belongs to the Topic 3D Computer Vision and Smart Building and City, 3rd Edition)

12 pages, 1152 KiB

Open Access Article

Machine Learning Models Derived from [¹⁸F]FDG PET/CT for the Prediction of Recurrence in Patients with Thymomas

by Angelo Castello, Luigi Manco, Margherita Cattaneo, Riccardo Orlandi, Lorenzo Rosso, Giorgio Alberto Croci, Luigia Florimonte, Giovanni Scribano, Alessandro Turra, Stefano Ferrero, Mario Nosotti, Gianpaolo Carrafiello, Massimo Castellani and Paolo Mendogni

Bioengineering 2025, 12(7), 721; https://doi.org/10.3390/bioengineering12070721 - 30 Jun 2025

Viewed by 243

Abstract

Background/Objectives: This study aimed to develop machine learning (ML) models to predict recurrence in thymoma patients using conventional and radiomic signatures extracted from preoperative [¹⁸F]FDG PET/CT. Methods: A total of 50 patients (25 males, 25 females; mean age 63.3 ± 14.2 years) who underwent thymectomy and preoperative [¹⁸F]FDG PET/CT between 2012 and 2022 were retrospectively analyzed. Radiomic analysis was performed using free-from-recurrence (FFR) status as a reference. Clinico-metabolic PET parameters were collected, and thymoma lesions were manually segmented on [¹⁸F]FDG PET/CT. A total of 856 radiomic features (RFts) were extracted from PET and CT datasets following IBSI guidelines, and robust RFts were selected. The dataset was split into training (70%) and validation (30%) sets. Two ML models (PET- and CT-based, respectively), each with three classifiers (Random Forest (RF), Support Vector Machine, and Tree), were trained and internally validated using RFts and clinico-metabolic signatures. Results: A total of 50 ROIs were selected and segmented. FFR was observed in 84% of our cohort. Forty-three robust RFts were selected from the CT dataset and 16 from the PET dataset, predominantly wavelet-based RFts. Additionally, three metabolic PET parameters were selected and included in the PET Model. Both the CT and PET models successfully discriminated FFR status after surgery, with the CT Model slightly outperforming the PET Model across different classifiers. The performance metrics of the RF classifier for the CT and PET models were AUC = 0.970/0.949, CA = 0.880/0.840, Precision = 0.884/0.842, Recall = 0.880/0.846, Specificity = 0.887/0.839, Sensitivity = 0.920/0.844, TP = 81.8%/83.3%, and TN = 92.9%/84.6%, respectively. Conclusions: ML models trained on PET/CT radiomic features show promising results for predicting recurrence in patients with thymomas and could potentially be applied in clinical practice to support more personalized treatment strategies.

(This article belongs to the Special Issue Application of Artificial Intelligence in Oncologic PET Imaging)

26 pages, 5110 KiB

Open Access Article

Rolling Based on Multi-Source Time–Frequency Feature Fusion with a Wavelet-Convolution, Channel-Attention-Residual Network-Bearing Fault Diagnosis Method

by Tongshuhao Feng, Zhuoran Wang, Lipeng Qiu, Hongkun Li and Zhen Wang

Sensors 2025, 25(13), 4091; https://doi.org/10.3390/s25134091 - 30 Jun 2025

Viewed by 275

Abstract

As a core component of rotating machinery, the condition of rolling bearings is directly related to the reliability and safety of equipment operation; therefore, the accurate and reliable monitoring of bearing operating status is critical. However, when dealing with non-stationary and noisy vibration signals, traditional fault diagnosis methods are often constrained by the limited feature characterization of a single time–frequency analysis and by inadequate feature extraction capabilities. To address this issue, this study proposes a lightweight fault diagnosis model (WaveCAResNet) enhanced with multi-source time–frequency features. Fusing complementary time–frequency features derived from the continuous wavelet transform, short-time Fourier transform, Hilbert–Huang transform, and Wigner–Ville distribution significantly improves the capability to characterize complex fault patterns. WaveCAResNet is built on residual networks and integrates multi-scale analysis via a wavelet convolutional layer (WTConv) with the dynamic feature optimization properties of channel-attention-weighted residuals (CAWRs) and the efficient temporal modeling capabilities of weighted residual efficient multi-scale attention (WREMA). Experimental validation indicates that the proposed method achieves higher diagnostic accuracy and robustness than existing mainstream and state-of-the-art bearing fault diagnostic models on typical bearing datasets, even under noisy conditions.
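
Of the four time–frequency representations named above, the short-time Fourier transform is the simplest to sketch. The example below is a minimal NumPy illustration, not the authors' pipeline; the Hann window, the frame length, and the hop size are assumptions, and `stft_magnitude` is a hypothetical helper name.

```python
import numpy as np

def stft_magnitude(signal, frame_len, hop):
    """Magnitude spectrogram from Hann-windowed frames, one FFT per frame.

    Returns an array of shape (frame_len // 2 + 1, n_frames).
    """
    signal = np.asarray(signal, dtype=float)
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop:i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T
```

For a signal sampled at `fs` Hz, row `b` of the result corresponds to frequency `b * fs / frame_len`. A map like this could be resized and stacked with the other three representations as input channels, although the abstract does not describe how the fusion is actually performed.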

(This article belongs to the Section Fault Diagnosis & Sensors)

Search Results (1,216)
