MDPI - Publisher of Open Access Journals

18 pages, 3730 KB

Open AccessArticle

Breast Cancer Diagnosis Method Based on Phase Congruency and Dual-Branch Feature Modeling

by Yurui Shi, Enlin Wang, Mengda Zhao and Jianxin Zhang

Appl. Sci. 2026, 16(11), 5280; https://doi.org/10.3390/app16115280 - 25 May 2026

Breast cancer histopathological image classification remains a challenging task because reliable diagnosis depends on both fine-grained local lesion characteristics and multi-scale global tissue structures. However, current deep learning approaches often struggle to integrate these complementary cues effectively, particularly in the presence of staining variations, ambiguous lesion boundaries, and limited annotated datasets. To address these challenges, we propose a novel method called UNI-Phase-Dual Network (UPDNet). This approach enhances the detection of stable lesion boundaries and subtle patterns by incorporating phase congruency, and combines it with global tissue information from the UNI foundation model. The method uses two branches to process features from different perspectives, one focusing on fine details and the other capturing broader context. Additionally, we apply a fine-tuning strategy that improves generalization and reduces overfitting on small datasets. Experiments on three widely used breast cancer datasets, BRACS, BreakHis, and BACH, demonstrate that UPDNet significantly outperforms existing methods. Specifically, on the 7-class BRACS task, UPDNet achieves 68.58% accuracy, a 2.21% improvement over previous methods, together with a 1.48% increase in the weighted F1 score. These results indicate the strong potential of UPDNet in breast cancer histopathological image classification.

42 pages, 5367 KB

Open AccessArticle

Wavelet-Guided Mamba-Attention Network for Boundary-Aware Colorectal Polyp Segmentation

by Xin Liu, Nor Ashidi Mat Isa, Chao Chen, Hanxu Liu, Chao Wang and Fajin Lv

Mach. Learn. Knowl. Extr. 2026, 8(6), 142; https://doi.org/10.3390/make8060142 - 23 May 2026

Abstract

Colorectal cancer is the third most commonly diagnosed cancer worldwide, and early detection of polyps via colonoscopy is essential for improving patient survival. However, automatic polyp segmentation faces three key challenges: balancing global context with local detail, delineating ambiguous boundaries under low contrast, and handling large variations in polyp size and morphology. To address these challenges, we propose WMA-Net, a Wavelet-Guided Mamba-Attention Network that uses wavelet-domain semantic–boundary separation as the organizing design principle. Rather than introducing a new individual operator, the contribution lies in how existing components (wavelet decomposition, Mamba state space modeling, multi-directional pixel difference convolution, and uncertainty-aware reverse attention) are combined and coordinated within one boundary-aware framework. The architecture integrates pixel difference convolution for multi-directional edge detection, frequency-selective cross-scale fusion with dual-stream wavelet-domain processing, Mamba-based multi-scale aggregation with linear complexity, and uncertainty-aware progressive boundary refinement. Extensive experiments on five public polyp benchmarks demonstrate state-of-the-art performance on four out of five datasets. On the seen datasets, WMA-Net achieves mean Dice scores of 94.4% on CVC-ClinicDB and 93.6% on Kvasir-SEG. On the unseen datasets, WMA-Net attains 91.7% on CVC-300, 82.3% on CVC-ColonDB, and 83.8% on ETIS-LaribPolypDB, demonstrating robust cross-dataset generalization. Comprehensive ablation studies validate the effectiveness and synergy of each proposed module.

(This article belongs to the Special Issue Artificial Intelligence for Signal, Image, and Multimodal Data Processing: Algorithms, Models, and Knowledge Extraction)
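
For readers unfamiliar with the metric quoted in this abstract, the following is a minimal sketch of how a mean Dice score can be computed for binary segmentation masks. The function names (`dice_score`, `mean_dice`) and the nested-list mask format are illustrative assumptions; the abstract does not describe the authors' evaluation code.

```python
def dice_score(pred, target):
    """Dice coefficient of two binary masks given as nested lists of 0/1."""
    inter = pred_sum = target_sum = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += p * t          # pixel counted only if set in both masks
            pred_sum += p
            target_sum += t
    if pred_sum + target_sum == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * inter / (pred_sum + target_sum)


def mean_dice(pairs):
    """Average Dice over an iterable of (prediction, ground_truth) mask pairs."""
    scores = [dice_score(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    truth = [[1, 0], [0, 0]]
    print(dice_score(pred, truth))
```

A benchmark-level figure such as those above would typically be the average of this per-image score over a test set, although averaging conventions differ between papers.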

16 pages, 1495 KB

Open AccessArticle

DDCATNet: Effective Deep Learning-Based Illumination Color Cast Estimation Approach for Achieving Computational Color Constancy

by Ho-Hyoung Choi

Sensors 2026, 26(11), 3313; https://doi.org/10.3390/s26113313 - 23 May 2026

Abstract

Digital camera sensors are designed to capture a wide range of incident illuminants, enabling the creation of high-quality images. However, these sensors cannot differentiate between the color of the source illuminant and the actual (original) color of the object being captured. For this reason, computational color constancy (CCC) was introduced and has been developed over decades. CCC models the color perception of the human visual system (HVS) by ensuring accurate object color determination under varying source illuminant conditions. At the core of human visual perception (HVP)-based CCC is attaining higher accuracy in scene illuminant estimation. The recent emergence of deep convolutional neural networks (DCNNs) has fundamentally transformed the CCC research landscape by enabling more accurate illuminant estimation. Nevertheless, accurate illuminant estimation remains a major challenge for both traditional and state-of-the-art (SOTA) approaches. To further advance precision in illuminant estimation, this article presents a novel learning-based illumination color cast estimation approach to HVP-based CCC. Most importantly, the proposed approach is intended to integrate informative features into both channel and spatial regions while preserving long-term dependency feature information through dense skip connections. To achieve these objectives, the proposed Dense Dual Connection Aggregated Transform Network (DDCATNet) architecture comprises several modules: shallow feature extraction, channel-wise and spatial feature-based Dense Dual Connection (DDC), fusion of the dense channel-wise attention (CA) and spatial attention (SA) branches through a gate mechanism (GM) unit, and aggregate transform.
It is worth noting that both the CA blocks and the SA blocks in the DDC module are characterized by dense and cascading connections, meant to preserve long-term feature information and modulate different-level feature information at both global and local scales. The densely connected CA branch (DCA) and the densely connected SA branch (DSA) are also highly effective in retaining high-contribution information while suppressing redundant data. The GM unit is placed at the end of the DDC module, fusing the DCA and DSA branches to ensure the adaptive merging of useful hierarchical feature information and the extraction of more valuable feature information. As a result, the proposed DDCATNet architecture significantly improved precision in illuminant estimation. In rigorous experiments on a wide range of datasets, the proposed DDCATNet approach outperformed its SOTA counterparts, validating its efficacy, generalization capability, and robust camera invariance across diverse single- and multi-illuminant datasets and model architectures.

(This article belongs to the Section Sensing and Imaging)

19 pages, 5072 KB

Open AccessArticle

MDCL-DETR: Multi-Domain Enhancement and Cross-Layer Feature Fusion for Small Object Detection

by Tianran Hao, Xiao Zhang and Bing Zhou

Sensors 2026, 26(11), 3305; https://doi.org/10.3390/s26113305 - 22 May 2026

Viewed by 148

Abstract

Small object detection in uncrewed aerial vehicle (UAV) imagery is hindered by limited pixels, insufficient detailed information, and strong background interference, leading to weak feature representation and poor contextual modeling. To address these issues, we propose a multi-domain enhancement and cross-layer feature fusion detection Transformer (MDCL-DETR) with progressive feature processing. First, a multi-domain enhancement module (MDEM) based on CSP (cross stage partial) structure is proposed, which fuses spatial and frequency-domain features in a lightweight manner to enhance object detail and global structures while effectively distinguishing object features from background interference. Second, a cross-layer feature extraction module (CLEM) is introduced to aggregate multi-scale features across layers, alleviate information loss caused by downsampling, and preserve spatial details of small objects while integrating high-level contextual semantics. Meanwhile, a gated Mamba fusion module (GMFM) is proposed, which adopts the Mamba architecture for long-range dependency modeling of multi-scale features and integrates a gating mechanism to realize the dynamic weighted fusion of local details and global context, further improving feature discriminability and global modeling capability. Finally, a fine-grained enhancement module (FGEM) is designed, which leverages feature reorganization and adaptive feature extraction to reinforce and compensate fine-grained features. Extensive experimental results validate the effectiveness and generalization of the proposed method, achieving mAP$_{50}$ scores of 54.1% and 56.2% on the VisDrone2019 and AI-TOD datasets.

(This article belongs to the Section Sensing and Imaging)
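
The subscript 50 in the metric above conventionally refers to an intersection-over-union (IoU) threshold of 0.5 for counting a detection as correct. The sketch below illustrates that matching criterion only; the helper names and the `(x1, y1, x2, y2)` box format are assumptions, and the full mean average precision computation (ranking by confidence, precision-recall integration, class averaging) is not shown.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_threshold(pred_box, gt_box, threshold=0.5):
    """True if the predicted box overlaps the ground truth enough to count as a hit."""
    return box_iou(pred_box, gt_box) >= threshold


if __name__ == "__main__":
    gt = (0, 0, 8, 8)                        # an 8x8-pixel object
    print(box_iou((2, 0, 10, 8), gt))        # prediction shifted by 2 px
    print(box_iou((3, 0, 11, 8), gt))        # prediction shifted by 3 px
```

For an 8-pixel-wide object, a 2-pixel shift still passes the 0.5 threshold while a 3-pixel shift does not, which gives a sense of why localization errors weigh heavily on small objects.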

26 pages, 6128 KB

Open AccessArticle

Reliability-Guided Adaptive Feature Fusion Network for Noise-Robust Bearing Fault Diagnosis

by Song Yang, Mei Liu, Yukang Chen, Jianfeng Zhang, Peng Wang and Pengfei Luo

Sensors 2026, 26(11), 3288; https://doi.org/10.3390/s26113288 - 22 May 2026

Viewed by 68

Abstract

Cross-noise fault diagnosis remains challenging due to the mismatch between training and testing noise conditions, which degrades feature reliability and model generalization. To address this issue, this paper proposes a reliability-guided adaptive feature fusion framework (RGAF-Net). The method focuses on sample-wise adaptive feature fusion, where an enhanced wide first-layer convolutional neural network (WDCNN) backbone is employed to improve multi-scale feature extraction under noisy environments. In addition, a dual-path architecture is introduced to provide complementary representations, including globally robust structural representations and locally detail-sensitive structural responses. Furthermore, a lightweight reliability estimation module is designed to characterize each input sample's signal degradation tendency under noisy conditions, based on which a sample-wise routing mechanism dynamically adjusts feature contributions during feature fusion. Experiments on two public bearing datasets (PU and JNU) under cross-noise settings demonstrate that the proposed method achieves improved performance compared with representative approaches, particularly under severe noise conditions. For example, on the JNU dataset at −10 dB, the proposed method improves the Macro-F1 score by over 19 percentage points compared with the baseline WDCNN. Ablation studies and visualization analyses further demonstrate the effectiveness and adaptive fusion behavior of the proposed framework. The results indicate that the proposed method provides an effective solution for robust fault diagnosis under noise mismatch scenarios.

(This article belongs to the Section Fault Diagnosis & Sensors)
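
A noise level such as −10 dB is usually a signal-to-noise ratio (SNR). The following sketch shows one common way to corrupt a clean signal with white Gaussian noise at a target SNR; it assumes additive Gaussian noise and uses hypothetical helper names, since the abstract does not state how the noisy test conditions were generated.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng=None):
    """Return signal plus white Gaussian noise scaled to the requested SNR in dB."""
    if rng is None:
        rng = random.Random(0)  # fixed seed so the sketch is reproducible
    signal_power = sum(x * x for x in signal) / len(signal)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


def measured_snr_db(clean, noisy):
    """Empirical SNR in dB of a noisy signal relative to its clean version."""
    signal_power = sum(c * c for c in clean) / len(clean)
    noise_power = sum((n - c) ** 2 for c, n in zip(clean, noisy)) / len(clean)
    return 10.0 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    clean = [math.sin(2 * math.pi * 50 * t / 12000) for t in range(24000)]
    noisy = add_noise_at_snr(clean, -10.0)
    print(round(measured_snr_db(clean, noisy), 2))
```

At −10 dB the noise power is ten times the signal power, which is why this setting is described as a severe noise condition.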

29 pages, 4755 KB

Open AccessArticle

DenseViT-OCT: A Hybrid CNN-Transformer Architecture with Multi-Scale Dense Feature Aggregation for Automated Epiretinal Membrane Severity Classification

by Elif Yusufoğlu, Salih Taha Alperen Özçelik, Orhan Atila, Numan Halit Guldemir and Abdulkadir Sengur

Tomography 2026, 12(6), 76; https://doi.org/10.3390/tomography12060076 - 22 May 2026

Viewed by 76

Abstract

Background/Objectives: Epiretinal membrane (ERM) is a common vitreoretinal disorder characterized by fibrocellular proliferation on the inner retinal surface, often leading to progressive visual impairment. Accurate grading of ERM severity using optical coherence tomography (OCT) is critical for treatment planning and surgical decision-making; however, manual grading is labor-intensive and subjective. This study aims to develop an automated and reliable deep learning-based method for ERM severity classification. Methods: We propose DenseViT-OCT, a hybrid deep learning model that integrates dense convolutional neural networks (CNN) and vision transformers (ViT). The model introduces three key modules: Multi-Scale Dense Feature Aggregation (MDFA) for capturing hierarchical features across multiple spatial scales, Adaptive Feature Calibration (AFC) for enhancing feature discrimination through channel and spatial attention, and Cross-Attention Feature Fusion (CAFF) for enabling bidirectional interaction between convolutional and transformer representations. The model was trained and evaluated on 2195 OCT B-scan images obtained from 397 patients. Results: DenseViT-OCT achieved an overall accuracy of 94.76% on the internal four-class test set, outperforming 19 benchmark models, including ConvNeXt, EfficientNet, ViT, and Swin Transformers. The model demonstrated balanced performance with a macro-averaged precision of 93.76%, recall of 93.22%, F1-score of 93.47%, Cohen’s kappa of 92.62%, and macro-Area Under the Curve (AUC) of 98.95%. Ablation experiments confirmed the contribution of the proposed MDFA, AFC, CAFF, and deep supervision components, with the full model consistently outperforming reduced variants and standalone DenseNet121 and ViT-B/16 backbones. In repeated experiments across five random seeds, DenseViT-OCT also achieved the best mean accuracy (0.9399 ± 0.0052). 
External validation on the public multicenter OCTDL dataset, performed as binary ERM-versus-normal classification because of label availability, yielded 90.76% accuracy and 97.61% AUC, indicating promising generalization beyond the development cohort. Conclusions: DenseViT-OCT provides a robust framework for automated ERM severity classification from OCT B-scans. The combination of local CNN features, global transformer context, and dedicated fusion modules improves classification performance and yields clinically meaningful error patterns. Although further stage-wise multicenter validation, volumetric OCT analysis, and prospective clinical assessment are required, the proposed method shows promise as a research-oriented decision-support framework for B-scan-level ERM assessment.

(This article belongs to the Special Issue Medical Image Analysis in CT Imaging)
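
Among the metrics reported above, Cohen's kappa is the least self-explanatory: it measures agreement between predictions and labels after discounting the agreement expected by chance. The sketch below computes it from a confusion matrix; the function name and the example matrix are illustrative and are not data from the study.

```python
def cohens_kappa(confusion):
    """Cohen's kappa from a square confusion matrix (rows: true, columns: predicted)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Chance agreement: probability that true and predicted labels coincide
    # if they were drawn independently from their marginal distributions.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1.0 - expected)


if __name__ == "__main__":
    # Hypothetical two-class example, not taken from the paper.
    print(cohens_kappa([[45, 5], [5, 45]]))
```

The abstract expresses kappa as a percentage; the function above returns it on the usual scale where 1 means perfect agreement and 0 means chance-level agreement.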

24 pages, 467 KB

Open AccessFeature PaperArticle

Atomic Contrastive Verification: Fine-Grained Fact-Checking via Claim Decomposition and Knowledge Graph-Grounded Contrastive Reasoning

by Hyeong-Geun Kim, Tea-Sung Jun and Taeseon Lee

Mathematics 2026, 14(10), 1769; https://doi.org/10.3390/math14101769 - 21 May 2026

Viewed by 177

Abstract

Large language models (LLMs) frequently produce text that is fluent yet factually inconsistent with source documents. Detecting such inconsistency remains challenging, particularly when errors involve subtle entity substitutions, temporal distortions, or relational misattributions embedded within lengthy outputs. We propose Atomic Contrastive Verification (ACV), a training-free, graph-grounded fact-checking framework that decomposes both generated claims and source documents into atomic claims (minimal, self-contained factual units) and performs structured contrastive reasoning over each unit independently. For each atomic claim, ACV extracts a knowledge graph triple and generates contrastive claim variants through a multi-type perturbation taxonomy covering entity, relation, temporal, and quantitative dimensions. A novel Knowledge-Weighted Contrastive MMR mechanism, integrating graph-structural centrality and NLI-based logical diversity, selects the most discriminative subset of variants. Each selected variant is then pairwise compared against the claim; the resulting comparison responses are summarized to produce a per-claim verdict, and per-claim verdicts are aggregated into a document-level judgment. Experiments on the LLM-AggreFact benchmark (eleven subsets) demonstrate that ACV achieves competitive or superior performance compared to both specialized fine-tuned fact-checkers and large-scale LLMs. Beyond accuracy, ACV provides interpretable, claim-level error localization that existing methods cannot offer.

(This article belongs to the Section E: Applied Mathematics)
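
The selection step described above builds on maximal marginal relevance (MMR), which greedily picks items that score well while differing from those already chosen. The sketch below shows plain MMR only, not the knowledge-weighted variant proposed in the paper; the function name, the `lam` trade-off parameter, and the toy scores are assumptions made for illustration.

```python
def mmr_select(relevance, similarity, k, lam=0.7):
    """Greedy MMR: return indices of up to k items balancing relevance and diversity.

    relevance[i]     -- score of item i (higher is better)
    similarity[i][j] -- pairwise similarity between items i and j, in [0, 1]
    lam              -- weight on relevance; (1 - lam) penalizes redundancy
    """
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def mmr_score(i):
            # Redundancy is the similarity to the closest already-selected item.
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1.0 - lam) * redundancy
        best = max(candidates, key=mmr_score)
        selected.append(best)
        candidates.remove(best)
    return selected


if __name__ == "__main__":
    relevance = [0.9, 0.85, 0.5]
    similarity = [
        [1.0, 0.95, 0.1],
        [0.95, 1.0, 0.1],
        [0.1, 0.1, 1.0],
    ]
    print(mmr_select(relevance, similarity, k=2, lam=0.5))
```

In this toy case items 0 and 1 are near-duplicates, so with `lam=0.5` the second pick goes to the less relevant but more distinct item 2. In the framework described above, the relevance and diversity terms would presumably be replaced by the graph-centrality and NLI-based signals, whose exact form the abstract does not give.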

27 pages, 5714 KB

Open AccessArticle

Dynamic World Shannon Entropy as a Scale-Sensitive Indicator of Surface Urban Heat Island Intensity: Evidence from Seven Romanian Cities

by Zsolt Magyari-Sáska and Ionel Haidu

Remote Sens. 2026, 18(10), 1658; https://doi.org/10.3390/rs18101658 - 21 May 2026

Viewed by 195

Abstract

Surface urban heat island intensity is shaped not only by land-cover composition but also by the spatial heterogeneity of urban surfaces. This study evaluates whether Shannon entropy derived from Dynamic World class probabilities can serve as a robust indicator of pointwise SUHI intensity across seven major Romanian cities. Summer daytime Landsat 8/9 observations for 2021–2025 were harmonized into multi-year median land surface temperature composites, while Dynamic World probabilities were used to compute normalized Shannon entropy at 90, 150, 300, and 600 m aggregation windows. SUHI was defined relative to a rural reference whose delineation was examined through a multi-parameter sensitivity analysis, after which entropy–SUHI relationships were modeled using generalized additive models with and without an additional spatial smooth. Across all seven cities, the entropy–SUHI relationship was consistently negative, with higher entropy values tending to be associated with lower local thermal excess. The best-supported models were usually obtained at 150 m and more broadly within the 150–300 m range, while very coarse aggregation weakened performance. Spatially adjusted models explained 57.2–82.4% of SUHI deviance, showing that entropy is consistently associated with a stable but partial component of intra-urban thermal variability. Alternative tied-best rural delineations mainly shifted the SUHI baseline and left the fitted entropy response essentially unchanged. Our findings support probability-based entropy as a reliable, scale-sensitive descriptor of urban surface mixture relevant to intra-urban thermal patterning across diverse geographical and climatic settings.
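
The entropy measure at the center of this study can be illustrated with a short sketch. It computes Shannon entropy of a vector of class probabilities and divides by the logarithm of the number of classes so the result lies in [0, 1]. The function name is hypothetical, and the spatial aggregation over 90 to 600 m windows is not shown because the abstract does not specify how it was performed.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of class probabilities, scaled to [0, 1] by log(number of classes)."""
    if len(probs) < 2:
        raise ValueError("at least two classes are required")
    total = sum(probs)
    if total <= 0:
        raise ValueError("probabilities must sum to a positive value")
    entropy = 0.0
    for p in probs:
        q = p / total          # renormalize in case inputs do not sum exactly to 1
        if q > 0:              # 0 * log(0) is taken as 0
            entropy -= q * math.log(q)
    return entropy / math.log(len(probs))


if __name__ == "__main__":
    # Nine entries, matching the nine Dynamic World land-cover classes.
    mixed = [1 / 9] * 9
    pure = [1.0] + [0.0] * 8
    print(normalized_entropy(mixed), normalized_entropy(pure))
```

A value near 1 indicates a pixel whose class membership is spread evenly across land-cover types, and a value near 0 indicates a pixel dominated by a single class, which is the contrast the entropy–SUHI relationship above relies on.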

23 pages, 2533 KB

Open AccessArticle

Attention-Enhanced Segmentation for Vegetation and Snow Cover Extraction Supporting Grassland Fire Danger Factor Monitoring

by Weiping Liu, Shuye Chen, Yun Yang and Yili Zheng

Fire 2026, 9(5), 210; https://doi.org/10.3390/fire9050210 - 20 May 2026

Viewed by 164

Abstract

Grassland fire is one of the major disasters threatening regional ecological security. Its occurrence, development, and spread are closely related to the spatial distribution and coverage of surface vegetation and snow cover across grassland areas. Vegetation is the primary combustible fuel source: higher vegetation coverage increases fuel load and continuity, thereby directly determining grassland fire danger levels and accelerating fire spread. In contrast, snow cover indirectly regulates the spatiotemporal pattern of fire danger factors: it lowers surface temperature, raises near-surface humidity, and restricts the germination and growth of herbaceous vegetation in cold seasons, which reduces available combustible material and weakens regional fire hazard conditions. Therefore, accurately obtaining the coverage of vegetation (the direct combustible fuel factor) and snow cover (the indirect fire-regulating factor) in complex grassland scenarios is an essential premise for reliable grassland fire danger monitoring, early warning, disaster prevention and control, and regional ecological management. Complex grassland scenarios (undulating terrain, uneven vegetation growth, large differences in snow depth, and complex lighting conditions) make vegetation and snow-covered areas difficult to extract, blur their boundaries, and lower the accuracy of coverage calculation, all of which limit the precise monitoring of grassland fire danger factors. To address these problems, this study uses near-ground images collected by grassland fire danger factor monitoring stations as the core data source and proposes an improved UNet image segmentation model that extracts vegetation and snow-covered areas precisely and calculates coverage efficiently in complex scenarios.
To improve feature extraction and boundary localization while reducing model parameters and computational overhead, a CBAM-ASPP (Convolutional Block Attention Module - Atrous Spatial Pyramid Pooling) module is integrated at the end of the encoding path. The attention mechanism enhances the weight of key features, and the multi-scale receptive field of atrous spatial pyramid pooling strengthens the model's ability to fuse features of vegetation and snow areas at different scales. A residual attention mechanism is introduced in the upsampling stage to alleviate the vanishing gradient problem, improve the localization of vegetation and snow boundaries, and reduce segmentation errors. During training, a dynamically weighted hybrid loss function adjusts its weights according to the segmentation difficulty of different types of samples, improving segmentation accuracy and generalization ability. Experiments were conducted using near-ground images of typical complex grassland scenarios as the dataset, and the performance of the proposed model was verified through comparative experiments. The results show that in the vegetation segmentation task, the mean Intersection over Union (mIoU) of the model reaches 84.70% and the accuracy is 91.28%, which are 1.48 and 1.58 percentage points higher than those of the standard UNet model, respectively. In the snow segmentation task, the mIoU of the model reaches 92.74% and the accuracy is 94.19%, which are 2.39 and 2.36 percentage points higher than those of the standard UNet model, respectively. At the same time, the number of parameters of the model is reduced by 12.85% compared with the standard UNet.
Its overall performance is also significantly better than that of mainstream image segmentation models such as FCN, SegNet, and DeepLabv3+. Based on the standardized time-series data retrieved by the optimized segmentation model, this study further constructs a Grassland Fire Risk Index (GFRI) using the Analytic Hierarchy Process (AHP). Pearson correlation verification confirms that the GFRI has a highly significant positive correlation with historical fire frequency, accurately capturing the seasonal dynamics of regional grassland fire occurrence. This integrated framework of intelligent segmentation and fire risk quantification provides a reliable technical solution for grassland fire factor monitoring, dynamic fire risk assessment, early warning systems, and refined regional ecological management.

(This article belongs to the Special Issue Forest Fuel Treatment and Fire Risk Assessment, 2nd Edition)
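
The segmentation results above are reported as mean Intersection over Union (mIoU) and accuracy, and the segmented masks are then used to calculate coverage. The sketch below shows how these three quantities can be derived from flat lists of per-pixel class labels. The function names and the label encoding (0 for background, 1 for vegetation) are assumptions for illustration, not details from the paper.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both masks
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either mask")
    return sum(ious) / len(ious)


def pixel_accuracy(pred, target):
    """Fraction of pixels whose predicted label equals the ground-truth label."""
    return sum(1 for p, t in zip(pred, target) if p == t) / len(target)


def coverage(mask, cls):
    """Fraction of pixels assigned to class cls, e.g. vegetation cover in one image."""
    return sum(1 for m in mask if m == cls) / len(mask)


if __name__ == "__main__":
    pred = [0, 0, 1, 1]      # hypothetical labels: 0 = background, 1 = vegetation
    target = [0, 1, 1, 1]
    print(mean_iou(pred, target, 2), pixel_accuracy(pred, target), coverage(pred, 1))
```

Because coverage is computed directly from the predicted mask, segmentation errors at region boundaries carry over into the coverage estimate, which is consistent with the emphasis on boundary localization in the abstract.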

27 pages, 72468 KB

Open AccessArticle

Long-Tailed Remote Sensing Image Classification via Multi-Scale Data, Pre-Trained Model, and Efficient Inference Strategy

by Song Han, Xing Han, Yibo Xu, Yongqin Tian, Weidong Zhang and Wenyi Zhao

Remote Sens. 2026, 18(10), 1636; https://doi.org/10.3390/rs18101636 - 19 May 2026

Viewed by 238

Abstract

Remote sensing image classification is one of the fundamental tasks in the field of remote sensing and plays a critical role in Earth observation applications. However, the inherent multi-scale characteristics of this task pose significant challenges to scene classification. To address these issues, we propose a novel framework that integrates the Contrastive Language–Image Pre-training (CLIP) model, multi-scale data, and an efficient inference strategy. The proposed framework transfers general-purpose features learned from natural images to remote sensing image classification. Specifically, the framework leverages the rich feature representations learned by CLIP during contrastive learning and adopts CLIP as the backbone network to extract fine-grained and multi-scale features from remote sensing images. As a result, the model can learn local fine-grained details while also encoding global contextual information useful for distinguishing visually similar scene categories. An AdapterFormer module is then inserted into a few selected layers of the CLIP model, which effectively enhances model performance at low computational overhead. This supports efficient knowledge sharing and introduces new features at the model level. Furthermore, to alleviate the performance deterioration that multi-scale feature variation may cause, a multi-scale training set is constructed at the data level, providing complementary multi-scale information. Through the synergy of these strategies, the proposed method greatly improves classification performance on multi-scale remote sensing images. Extensive experiments on the MEET dataset, which includes 80 fine-grained categories and more than 800,000 samples, show that the proposed method greatly improves performance. Compared with general-purpose classification networks and remote sensing-related models, the proposed method consistently achieves state-of-the-art results.

(This article belongs to the Special Issue Hyperspectral Remote Sensing Image Analysis via Advanced Deep Learning and Computer Vision)

25 pages, 7136 KB

Open AccessArticle

Vibration-Based Condition Monitoring of Ground Engaging Tools Using Finite Element-Derived Modal Features

by Shasha Chen, Bernard F. Rolfe, James Griffin, Arnaldo Delli Carri and Michael P. Pereira

Vibration 2026, 9(2), 36; https://doi.org/10.3390/vibration9020036 - 19 May 2026

Viewed by 83

Abstract

Ground engaging tool (GET) wear monitoring is important for mining excavator maintenance, but progressive multi-tooth wear estimation remains insufficiently explored. This study presents a vibration-based framework for GET wear estimation during operations using modal analysis, finite element (FE) modelling, and machine learning as a supporting evaluation tool. A laboratory-scale mining bucket surrogate with detachable attached masses was used to represent progressive tooth wear through controlled mass-loss conditions. Experimental impact hammer tests under approximately free-free boundary conditions were conducted to validate the FE modal model through natural-frequency comparison and qualitative mode correspondence. The validated FE model was then used to generate a broader dataset of multi-tooth wear scenarios, from which the first ten natural frequencies were extracted as modal features. Linear Regression (LR) was adopted as a simple and interpretable baseline to evaluate both overall wear estimation and individual tooth wear estimation. High accuracy was obtained for overall wear estimation for both the non-symmetric and symmetry-augmented datasets, with R² values of 0.9983 and 0.9976, respectively. In contrast, individual tooth prediction was more challenging, and the symmetry-augmented results showed that mirrored tooth locations can produce non-unique frequency-based signatures. An additional asymmetric FE sensitivity study further confirmed that structural symmetry can limit local wear identifiability when only global natural frequencies are used. These findings demonstrate the potential of FE-derived modal frequency features for laboratory-scale GET wear assessment, while also highlighting the limitations of frequency-only features for unique local wear localisation in symmetric structures. This is a promising approach for wear estimation during mining operations.

(This article belongs to the Special Issue New Trends in Experimental and Numerical Vibroacoustic Techniques—Physics Guided and Datas Guided Approaches)

22 pages, 4822 KB

Open AccessArticle

LMamba: Local-Guided Mamba with Multi-Scale Filtering for Hyperspectral Image Classification

by Xiaofei Yang, Yao Wei, Jiarong Tan, Shuqi Li, Haojin Tang and Waixi Liu

Remote Sens. 2026, 18(10), 1629; https://doi.org/10.3390/rs18101629 - 19 May 2026

Viewed by 177

Abstract

Deep learning methods have significantly improved hyperspectral image (HSI) classification accuracy by exploiting hierarchical feature learning to integrate spatial and spectral information. Nevertheless, current deep learning approaches (such as CNNs, Transformers and Mamba) still face three major challenges: inadequate mitigation of spectral redundancy, high computational costs associated with global modeling, and the loss of two-dimensional spatial structure during sequential processing. To address these issues, we propose LMamba, a task-oriented hybrid framework that combines multi-scale convolutional filtering with local-context-conditioned state space modeling for hyperspectral image classification. Rather than introducing a fundamentally new SSM formulation, LMamba focuses on adapting the input-dependent parameter projection of Mamba to HSI data by injecting local 2D neighborhood context into the generation of selective SSM parameters. This design enables the state space module to better preserve spatial continuity while maintaining linear-complexity sequence modeling. The framework consists of two core components. First, the Multi-scale Aggregation and Compression Block (MACB) employs parallel grouped convolutions with varying kernel sizes to capture spatial features at multiple scales while simultaneously reducing spectral redundancy through channel compression. Second, the Locally Guided 2D Scanning Mechanism replaces conventional unidirectional 1D scanning with a context-aware 2D scanning strategy, thereby preserving structural continuity and enhancing feature representation by integrating local neighborhood spatial information into state transitions. Validation on three prominent HSI datasets demonstrates that LMamba consistently outperforms state-of-the-art methods based on CNNs, Transformers, and SSMs as measured by overall accuracy (OA), average accuracy (AA), and the Kappa coefficient. In summary, LMamba provides an efficient and accurate HSI classification framework under the considered benchmark settings, and its compact complexity and low-sample robustness suggest potential usefulness for practical HSI analysis.

(This article belongs to the Special Issue AI-Driven Hyperspectral Image Classification and Processing in Remote Sensing)

33 pages, 16764 KB

Open AccessArticle

DC-FusionGNN: A Dual-Channel Framework Integrating Global Self-Attention and Local Topology Learning for Identifying Key Resistance Genes Against Fusarium graminearum Infection in Maize

by Yinfei Dai, Mengjiao Qiao, Jie Fan, Shihao Lu, Enshuang Zhao, Yuheng Zhu, Hanbo Liu and Hao Zhang

Plants 2026, 15(10), 1540; https://doi.org/10.3390/plants15101540 - 18 May 2026

Viewed by 127

Abstract

Fusarium graminearum infection of maize induces complex transcriptional reprogramming, yet existing differential-expression and local graph convolutional approaches struggle to capture long-range and multi-scale regulatory dependencies. We propose DC-FusionGNN, a dual-channel fusion graph neural network for key resistance-gene identification. Based on the transcriptome dataset GSE174508, we first construct a comprehensive gene interaction network by integrating a WGCNA co-expression network with a STRING-based interaction network. The left channel combines structure-aware propagation with a Transformer-based global self-attention mechanism to model long-range cross-module dependencies, while the right channel couples GraphSAGE with a GCN to capture local topology and neighborhood heterogeneity. Embeddings from the two channels are concatenated to form a unified gene representation, trained via self-supervised link prediction. Compared with baseline graph neural networks, DC-FusionGNN achieves competitive and overall improved performance across multiple metrics, and robustness and independent cross-species (rice, GSE39635) experiments further confirm its stability and generalization ability. GO and KEGG enrichment analyses show that the top-ranked candidate genes are significantly enriched in plant defense responses, hormone signaling, and secondary metabolism, supporting the biological relevance of the model’s predictions.

(This article belongs to the Special Issue Applications of Bioinformatics in Plant Science)

18 pages, 7647 KB

Open AccessArticle

WS-DINO: A DINOv2-Based Weed Segmentation Method with Feature Priors and Spatial Fusion

by Hongsheng Zhou, Jiangping Liu, Rigeng Wu and Baoping Zhao

Agriculture 2026, 16(10), 1105; https://doi.org/10.3390/agriculture16101105 - 18 May 2026

Viewed by 265

Abstract

Weed segmentation is a fundamental task in precision agriculture, essential for targeted intervention and sustainable farming. However, achieving accurate segmentation remains challenging due to the high visual similarity between weeds and crops, as well as the ambiguous, fine-grained boundaries often present in complex field environments. To address this, we present WS-DINO, a novel weed segmentation network built upon the DINOv2 vision foundation model. Our framework introduces two key innovations: (1) a Feature Prior Module that leverages a Canny-guided refinement process to extract and inject fine-grained cues related to weed texture, morphology, and boundaries into specific blocks of the Vision Transformer; and (2) a Spatial Feature Fusion Module that leverages convolutional layers to generate multi-scale spatial features, which are then fused with the semantically rich token features from DINOv2, effectively compensating for the Transformer’s limitations in capturing local spatial details. Comprehensive evaluation on the public PhenoBench dataset shows that WS-DINO achieves an mIoU of 88.67% and outperforms the evaluated benchmark methods. Moreover, on the challenging MotionBlurred dataset, WS-DINO reaches 88.75% mIoU, showing stable performance under motion blur and degraded visual conditions.

(This article belongs to the Topic Digital Agriculture, Smart Farming and Crop Monitoring)

17 pages, 4236 KB

Open AccessArticle

MultiTask-Fish: A Shared Backbone Multitask Counting Method for Complex Fish School Scenes

by Sikun Wang, Jing-Wein Wang and Cunwei Lu

Information 2026, 17(5), 491; https://doi.org/10.3390/info17050491 - 17 May 2026

Viewed by 164

Abstract

With the growing demand for intelligent monitoring in land-based aquaculture, rapid and accurate fish counting from visual data has become important for stocking density regulation, feeding management, and production decisions. To address the challenges in above-water fish images, including scale variation, severe occlusion and adhesion, blurred boundaries, and frequent switching between low- and high-density scenes, this study proposes MultiTask-Fish, a shared backbone multitask counting method. The network uses ResNet34 as the backbone and integrates a feature pyramid network and channel attention to learn unified feature representations. It jointly predicts detection heatmaps, foreground masks, separation boundaries, density maps, density gating, and global count regression, allowing the model to combine local localization cues, structural information, and global statistics. Based on existing polygon annotations, heatmap, mask, boundary, and density supervision are automatically generated for integrated multitask training. Experiments on 495 fish images, including 346 training and 149 validation images, showed that the proposed method achieved an MAE of 5.875, an RMSE of 11.839, and an MAPE of 0.152 on the validation set, while reducing the MAE on the high-density subset from 16.717 to 13.895. These results demonstrate its effectiveness for fish counting in complex above-water aquaculture scenes.

(This article belongs to the Section Artificial Intelligence)

Search Results (850)
