MDPI - Publisher of Open Access Journals

26 pages, 5547 KB

Open Access Article

A Lightweight Framework for Tea Shoot Detection and Plucking Point Localization Enabled by Modified YOLOv11s-Seg Model

by Yongmao Huang, Yuankai Luo, Yuanxi Mu and Haiyan Jin

Agriculture 2026, 16(12), 1357; https://doi.org/10.3390/agriculture16121357 (registering DOI) - 20 Jun 2026

In this work, a lightweight framework enabled by a modified YOLOv11s-seg model is proposed for tea shoot detection and plucking point localization. Detecting tea shoots and localizing plucking points with higher accuracy generally requires a larger model with more parameters, making it difficult to balance accuracy and lightweighting. To overcome this limitation, a modified lightweight YOLOv11s-seg model is developed. First, multi-scale edge information enhancement is introduced into the conventional YOLOv11s-seg to extract edge features more effectively and improve the detection accuracy of tea shoots. Meanwhile, context anchor attention is used to modify the cross stage partial spatial attention module in the backbone network to improve the detection capability for small objects. Moreover, a detail calibration reconstruction feature pyramid network is proposed. It uses spatial and contextual semantic information to reconstruct and calibrate features in key regions, enhancing the capability for object fusion and recognition at various scales. Furthermore, with the modified model performing instance segmentation to acquire the contour of each tea shoot, the coordinates of the three lowest pixel points in the contour are captured, and the plucking point is localized at their average. In addition, the layer-adaptive magnitude-based pruning (LAMP) method is used to lighten the model. The experimental results show that the LAMP-pruned modified YOLOv11s-seg model with a speedup ratio of 1.5 achieves a mAP@0.5 of 86.5% for tea shoot detection, a 4.7 percentage point improvement over the conventional YOLOv11s-seg model. Moreover, it achieves an accuracy of 81.9% for plucking point localization on the validation and test subsets (232 images in total), and its number of parameters, model size, and floating point operations (FLOPs) are reduced by 67.3%, 66.2%, and 24.9%, respectively, relative to the conventional model. Therefore, the proposed LAMP-pruned modified model shows a good balance between lightweighting and detection accuracy. Finally, the LAMP-pruned modified YOLOv11s-seg model is deployed on a Jetson Orin NX edge module and tested in a tea plantation, where it reaches a detection speed of 34.1 FPS, confirming its usability in practical applications.
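
The plucking-point rule stated in the abstract (average the three lowest contour pixels) can be sketched in a few lines. This is only an illustration under assumptions: the contour is taken to be a list of (x, y) pixel coordinates in image space, where y grows downward, and the function name `plucking_point` and the sample contour are hypothetical rather than taken from the paper.

```python
import numpy as np

def plucking_point(contour_xy):
    """Average the three lowest contour pixels of one segmented shoot.

    Assumes image coordinates, so "lowest" means the largest y values.
    """
    pts = np.asarray(contour_xy, dtype=float)
    if pts.ndim != 2 or pts.shape[1] != 2 or len(pts) < 3:
        raise ValueError("expected at least three (x, y) contour points")
    # Sort by y and keep the last three rows (largest y = lowest in the image).
    lowest = pts[np.argsort(pts[:, 1], kind="stable")[-3:]]
    x_mean, y_mean = lowest.mean(axis=0)
    return float(x_mean), float(y_mean)

# Hypothetical contour of a single shoot mask.
contour = [(10, 5), (14, 20), (12, 30), (9, 31), (11, 32), (15, 12)]
point = plucking_point(contour)
```

How ties between equally low pixels are resolved is not described in the abstract; here they are resolved by sort order.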

(This article belongs to the Special Issue Advances in Precision Agriculture in Orchard)

19 pages, 2129 KB

Open Access Article

Do It Once: Concatenating the Image Pair for a Single Pass Feature Extraction in Stereo Depth Sensing

by Žan Regoršek and Andrej Žemva

Sensors 2026, 26(12), 3919; https://doi.org/10.3390/s26123919 (registering DOI) - 20 Jun 2026

Abstract

In the field of stereo depth sensing, modern research predominantly prioritizes accuracy, yet inference speed remains a critical bottleneck for practical, real-time applications on resource-constrained platforms. Existing acceleration approaches often rely on lighter network architectures or runtime-specific optimizations, which may require architectural redesign, platform-specific tuning, or accuracy trade-offs. However, a common inefficiency remains in many stereo pipelines: feature extraction is typically performed using two separate forward passes, one for the left image and one for the right, even though both passes use the same network weights. We address this redundancy by concatenating the left and right images into a single combined tensor, enabling feature extraction in one batched pass while preserving the original network architecture. By reducing feature extraction time by up to 48.4%, our results demonstrate that this method accelerates the overall inference rate by 10% to 39% on average on an Nvidia V100 and by up to 28.4% on an edge device, depending on the model architecture. This speedup is achieved at the expense of only a moderate increase in runtime memory consumption, while retaining the original accuracy. Because the method does not alter the core stereo network, it can be applied as a plug-and-play enhancement to both existing and newly developed stereo matching models.
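
The single-pass idea can be illustrated with a toy shared-weight extractor. This sketch assumes the pair is stacked along the batch axis, which the abstract suggests but does not spell out, and the extractor here (a per-pixel linear map with ReLU) is a hypothetical stand-in for a real stereo backbone.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical shared weights: a 1x1 convolution mapping 3 -> 8 channels.
weights = rng.standard_normal((3, 8))

def extract_features(batch):
    """Toy extractor on an (N, H, W, 3) batch: per-pixel linear map + ReLU."""
    return np.maximum(batch @ weights, 0.0)

left = rng.standard_normal((1, 4, 6, 3))
right = rng.standard_normal((1, 4, 6, 3))

# Conventional pipeline: two forward passes with the same weights.
feat_left_2pass = extract_features(left)
feat_right_2pass = extract_features(right)

# Single pass: stack the pair along the batch axis, run once, then split.
pair = np.concatenate([left, right], axis=0)
feat_left, feat_right = np.split(extract_features(pair), 2, axis=0)
```

The two routes agree here because the toy extractor treats each batch item independently; a layer whose output depends on batch statistics would need separate consideration.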

(This article belongs to the Section Sensing and Imaging)

27 pages, 44553 KB

Open Access Article

A Spatial–DCT Feature Fusion Network for Copper Strips and Plates Surface Defect Segmentation

by Jun Liu, Guo Zhang, Yubo Gao, Jianping Wang, Xin Ouyang, Fajia Wan, Zihao Duan and Guolin Che

Appl. Sci. 2026, 16(12), 6211; https://doi.org/10.3390/app16126211 (registering DOI) - 19 Jun 2026

Viewed by 61

Abstract

Instance segmentation of surface defects is one of the research hotspots in the field of image segmentation. Due to limitations such as restricted receptive fields or the loss of fine-grained details, traditional neural network models still struggle to achieve sufficiently high segmentation accuracy for surface defects. To meet the demand for high-precision segmentation of surface defects on copper strips and plates in industrial quality inspection, this paper proposes a feature fusion segmentation network, termed DSFFNet. First, a dual-branch structure is designed in DSFFNet to fuse spatial-domain features with discrete cosine transform (DCT)-domain features, thereby obtaining richer feature information. Second, a 2D-DCT frequency feature extraction module is developed to more effectively capture the edge information of targets. Third, a triplet attention mechanism is introduced into the backbone network to form an attention-centric network. Finally, a bidirectional fusion module and a multi-scale fusion network are designed to capture finer-grained feature information. Comparative experiments conducted on the KUST-SEG-Dataset demonstrate that DSFFNet achieves 94.66% ± 1.07% (mask) mAP$_{50}$ and 95.38% ± 0.06% (box) mAP$_{50}$, outperforming several classic image segmentation methods. Furthermore, generalization experiments on the public NEU-Seg dataset yield a (mask) mAP$_{50}$ of 86.27% ± 0.01%. The generalization results indicate that DSFFNet is robust to datasets with similar defect types.

20 pages, 8064 KB

Open Access Article

Centroid Extraction Method Based on Multi-Scale Gaussian Fitting and Subpixel Edge Reconstruction

by Bing Han, Yuanzhang Song, Zhijing Fang, Hangyu Yue, Hongtao Ma, Yuegang Fu and Jian Song

Photonics 2026, 13(6), 594; https://doi.org/10.3390/photonics13060594 (registering DOI) - 18 Jun 2026

Viewed by 128

Abstract

Accurate spot-centroid localization is fundamental for determining optical metrics such as modulation transfer function (MTF) and effective focal length (EFL). Conventional methods struggle under non-ideal conditions (asymmetric spots, high noise, and vibration), and mid-wave infrared (MWIR) vibration has received little attention. To address these gaps, we propose multi-scale Gaussian fitting with subpixel edge reconstruction (MSGF-SER), combining image pyramid fitting, Zernike-moment edge extraction, and adaptive eccentricity-weighted fusion. Validated on simulated spots with varying SNRs and experimental sequences (visible off-axis aberration, long-wave infrared (LWIR) high-noise, MWIR micro-vibration), MSGF-SER achieved a noise-free RMSE of 0.03 pixel and 0.84 pixel at 5 dB SNR. On real MWIR vibration sequences, the Y-direction standard deviation (STD) dropped to 0.098 pixel, and the trajectory displacement variance was more than an order of magnitude lower than that of conventional methods. MTF deviations remained within 0.01, the measured mean EFL deviated from the nominal focal length by less than 0.05 mm, and the STD was below 0.02 mm. These results demonstrate that MSGF-SER substantially improves centroid localization accuracy, repeatability, and smoothness under challenging conditions, providing reliable support for high-precision optical system parameter measurement.
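
As a point of reference for the conventional methods the abstract contrasts against, an intensity-weighted centroid can be sketched as below. This is a generic baseline, not MSGF-SER; the function name, the synthetic spot, and its centre and width are illustrative assumptions.

```python
import numpy as np

def weighted_centroid(image, background=0.0):
    """Intensity-weighted centroid (x, y) of a spot, in pixel units."""
    img = np.clip(np.asarray(image, dtype=float) - background, 0.0, None)
    total = img.sum()
    if total == 0:
        raise ValueError("no signal above background")
    ys, xs = np.indices(img.shape)
    return (xs * img).sum() / total, (ys * img).sum() / total

# Synthetic, noise-free, symmetric Gaussian spot (illustrative values only).
yy, xx = np.indices((21, 25))
spot = np.exp(-((xx - 12.3) ** 2 + (yy - 9.7) ** 2) / (2 * 1.5 ** 2))
cx, cy = weighted_centroid(spot)
```

On a clean symmetric spot this baseline is accurate; the asymmetric, noisy, and vibrating cases named in the abstract are where it is expected to degrade.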

(This article belongs to the Special Issue Precision Measurement and Perception: Enabled by Advanced Optical Sensing, Imaging, and LiDAR Technologies)

20 pages, 13113 KB

Open Access Article

An Edge Computing-Enabled UAV-Based Image Mosaicing System Using a Novel B-SIFT-ILS Algorithm

by Linhui Wang, Zhizhuang Liu, Yu Yang, Lizhi Chen, Zhenqi Zhou, Mengyu Zeng and Yonghong Tan

Algorithms 2026, 19(6), 489; https://doi.org/10.3390/a19060489 - 18 Jun 2026

Viewed by 156

Abstract

In UAV-based remote sensing, accurate and efficient image mosaicing is crucial for achieving real-time monitoring. Traditional cloud-centric processing paradigms, however, face core scientific challenges such as high latency, bandwidth bottlenecks, and limited autonomy, making them inadequate for dynamic, real-time scenarios. To address these issues, this paper proposes an edge-computing-enabled UAV image mosaicing system. The system consists of a UAV remote sensing platform and an edge computing terminal, with the core being our novel B-SIFT-ILS algorithm. The algorithm first uses geographic coordinates for unified registration, constructs a Gaussian scale space for multi-resolution representation, and then precisely locates extrema in the Difference of Gaussian (DoG) space using a 3D quadratic function. A BANSAC algorithm is subsequently employed to refine feature points and extract stable SIFT features, and finally, Iterative Least Squares (ILS) is used to achieve seamless mosaicing. Experimental results demonstrate that, compared with classical RANSAC, the proposed method achieves superior feature sampling accuracy (rotation: 0.879, translation: 0.877) and lower latency. The ILS-based smoothing stage effectively eliminates noise and ghosting without introducing gradient reversal, performing comparably to deep learning methods while significantly outperforming direct averaging and Gaussian approaches. On the NVIDIA Jetson Orin NX edge terminal, a single processing instance requires only 1124 ms, highlighting its strong potential for real-time, low-latency, and autonomous mosaicing tasks. Future research will focus on extending the approach to non-planar terrains and implementing adaptive parameter tuning for the BANSAC algorithm.
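
The step of locating DoG extrema with a 3D quadratic function corresponds to the standard SIFT sub-sample refinement, which can be sketched as follows. This is the textbook formulation using finite differences on a 3 x 3 x 3 neighbourhood; whether the paper's implementation matches it in detail is an assumption, and the function name is hypothetical.

```python
import numpy as np

def refine_extremum(cube):
    """Refine an extremum from a 3x3x3 DoG neighbourhood.

    cube is indexed [scale, y, x] with the candidate at cube[1, 1, 1].
    Returns the offset (d_scale, d_y, d_x) and the interpolated value.
    """
    c = np.asarray(cube, dtype=float)
    centre = c[1, 1, 1]
    # Central-difference gradient along scale, y, and x.
    grad = 0.5 * np.array([
        c[2, 1, 1] - c[0, 1, 1],
        c[1, 2, 1] - c[1, 0, 1],
        c[1, 1, 2] - c[1, 1, 0],
    ])
    # Finite-difference Hessian.
    hess = np.empty((3, 3))
    hess[0, 0] = c[2, 1, 1] - 2 * centre + c[0, 1, 1]
    hess[1, 1] = c[1, 2, 1] - 2 * centre + c[1, 0, 1]
    hess[2, 2] = c[1, 1, 2] - 2 * centre + c[1, 1, 0]
    hess[0, 1] = hess[1, 0] = 0.25 * (c[2, 2, 1] - c[2, 0, 1] - c[0, 2, 1] + c[0, 0, 1])
    hess[0, 2] = hess[2, 0] = 0.25 * (c[2, 1, 2] - c[2, 1, 0] - c[0, 1, 2] + c[0, 1, 0])
    hess[1, 2] = hess[2, 1] = 0.25 * (c[1, 2, 2] - c[1, 2, 0] - c[1, 0, 2] + c[1, 0, 0])
    # Stationary point of the local quadratic model: offset = -H^{-1} g.
    offset = -np.linalg.solve(hess, grad)
    value = centre + 0.5 * grad @ offset
    return offset, value
```

In a typical SIFT pipeline, a candidate whose offset exceeds 0.5 in any dimension is re-centred on the neighbouring sample and refined again.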

(This article belongs to the Special Issue AI-Driven Optimization for Sustainable Edge-Cloud Continuum)

27 pages, 14942 KB

Open Access Article

An Explainable Deep Learning Framework for Morphology-Aware Coal Surface Characterization and Intelligent Coal Processing

by Mustafa Coşar

Minerals 2026, 16(6), 637; https://doi.org/10.3390/min16060637 - 16 Jun 2026

Viewed by 227

Abstract

Coal surface morphology plays a pivotal role in advanced coal processing operations, encompassing comminution, beneficiation, flotation kinetics, and intelligent process optimization. However, conventional characterization approaches are inherently labor-intensive and susceptible to inter-operator subjectivity, hindering their integration into autonomous industrial monitoring systems. To surmount these challenges, this study proposes an explainable morphology-aware coal surface characterization framework that synergizes unsupervised pseudo-label generation, targeted data augmentation, and explainable artificial intelligence. Initially, a high-dimensional feature set comprising grayscale intensity statistics, entropy, edge density, and Gray-Level Co-occurrence Matrix descriptors was extracted from 454 coal surface images. Subsequently, Principal Component Analysis and K-means clustering were implemented to identify intrinsic morphology-driven structural patterns, generating robust pseudo-labels without manual annotation. These pseudo-classes were utilized to train and benchmark multiple transfer learning architectures, including EfficientNetB0, VGG16, and MobileNetV3. Experimental results demonstrated that MobileNetV3 achieved superior classification efficacy under a strict leakage-safe evaluation configuration, exceeding 90% across key performance metrics while offering significantly lower computational complexity, which makes it well suited to edge-computing deployment. Furthermore, Grad-CAM-based interpretability analysis validated that the models focused on physically significant morphological features, such as fracture boundaries and heterogeneous texture transitions. These findings indicate that the proposed framework provides a robust, computationally efficient, and interpretable decision-support tool for smart beneficiation and intelligent industrial coal processing environments.
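
The pseudo-labelling stage (PCA followed by K-means) can be sketched with NumPy alone. This is a minimal illustration: the feature matrix is synthetic, the number of components and clusters are arbitrary assumptions, and the plain Lloyd iteration below stands in for whatever K-means implementation the study used.

```python
import numpy as np

def pca_project(features, n_components):
    """Standardize the features and project them onto the leading components."""
    x = features - features.mean(axis=0)
    x = x / (x.std(axis=0) + 1e-12)
    _, _, vt = np.linalg.svd(x, full_matrices=False)
    return x @ vt[:n_components].T

def kmeans(points, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm; returns cluster labels and centres."""
    rng = np.random.default_rng(seed)
    centres = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centres[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centres = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(new_centres, centres):
            break
        centres = new_centres
    return labels, centres

# Synthetic stand-in for hand-crafted texture features (two obvious groups).
rng = np.random.default_rng(1)
features = np.vstack([rng.normal(0, 1, (20, 5)), rng.normal(100, 1, (20, 5))])
pseudo_labels, _ = kmeans(pca_project(features, 2), k=2)
```

The resulting cluster indices would then serve as training labels for a classifier; the cluster numbering itself is arbitrary.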

(This article belongs to the Special Issue Advanced Coal Processing: Comminution, Concentration, Desulphurization and Process Optimization)

41 pages, 37891 KB

Open Access Article

VNIR Hyperspectral Signatures and Machine Learning for Early Detection and Classification of Barley Diseases

by Rimma M. Ualiyeva, Mariya M. Kaverina and Anastasiya V. Osipova

Plants 2026, 15(12), 1854; https://doi.org/10.3390/plants15121854 - 15 Jun 2026

Viewed by 223

Abstract

This study focuses on identifying barley diseases at various stages using the unique spectral signatures of phytopathogen infections. We examined the causal agents of widespread crop diseases, including loose smut, head blight, fusarium head blight (FHB), stem rust, net blotch, spot blotch, and common root rot. Analysing disease-specific spectral characteristics with machine learning (ML) algorithms revealed the most informative spectral ranges: the green region (~520–560 nm), the red chlorophyll absorption zone (~650–680 nm), and the red-edge region (~700 nm). These ranges accurately reflect alterations in the plant’s cellular structure and pigment complexes. Spectral data were processed using five ML algorithms. Random Forest (RF) proved to be the most effective for identifying and differentiating barley diseases, achieving an accuracy of up to 90.13% (MCC = 0.86). This superior performance stems from the ensemble method’s robustness to noise and its ability to extract critical features from high-dimensional hyperspectral data, particularly when distinguishing diseases with overlapping spectral signatures. Furthermore, this study highlights the potential of integrating UAV-based remote sensing to delineate reference zones, proximal hyperspectral imaging (HSI), and ML for robust plant health monitoring. This combined approach shows significant promise for early disease diagnostics, enabling site-specific treatments, curbing disease progression, and reducing pesticide application. Ultimately, these findings offer practical value for the agro-industrial sector in major grain-producing countries, especially in Central Asia, where agricultural advancement is a strategic priority for sustainable development and food security.
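
The abstract reports a Matthews correlation coefficient (MCC) alongside accuracy. A sketch of the multiclass form of MCC, computed from a confusion matrix, is given below; the function name and the toy labels are illustrative, and the study's own computation may have relied on a library routine instead.

```python
import numpy as np

def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from label arrays."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    classes = np.unique(np.concatenate([y_true, y_pred]))
    # Build the confusion matrix: rows are true classes, columns predictions.
    cm = np.zeros((len(classes), len(classes)))
    np.add.at(cm, (np.searchsorted(classes, y_true),
                   np.searchsorted(classes, y_pred)), 1)
    t_k = cm.sum(axis=1)   # samples per true class
    p_k = cm.sum(axis=0)   # samples per predicted class
    correct = np.trace(cm)
    total = cm.sum()
    denom = np.sqrt((total ** 2 - (p_k ** 2).sum()) * (total ** 2 - (t_k ** 2).sum()))
    return 0.0 if denom == 0 else float((correct * total - (t_k * p_k).sum()) / denom)

# Toy binary example: 2 true positives, 2 true negatives, 1 of each error.
mcc = multiclass_mcc([1, 1, 1, 0, 0, 0], [1, 1, 0, 0, 0, 1])
```

Unlike accuracy, MCC stays near zero for a classifier that ignores the minority classes, which is why it is often reported for imbalanced disease data.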

(This article belongs to the Section Plant Modeling)

23 pages, 19029 KB

Open Access Article

CETransUNet: An Intelligent Landslide Identification Method Based on Collaborative Optimization of Global Context and Dual Attention Mechanisms

by Tianli Sun, Chengsheng Yang, Jifeng Wu, Zewei Liu, Ziqian Wang and Xiaoqiang Cheng

Remote Sens. 2026, 18(12), 1974; https://doi.org/10.3390/rs18121974 - 13 Jun 2026

Viewed by 205

Abstract

Accurate landslide identification is crucial for enhancing emergency response capabilities during destructive geological hazards. Although deep-learning-based semantic segmentation has demonstrated effectiveness, substantial variations in landslide scales and environmental similarities continue to challenge existing methods. This paper systematically constructs a new co-seismic landslide dataset for the Yarlung Zangbo River basin based on the 2017 Nyingchi earthquake, effectively filling a critical regional data gap. This paper proposes CETransUNet (coordinate attention and edge-guided attention transformer UNet), a novel landslide detection model that integrates ResNet and Transformer architectures. Specifically, a coordinate attention (CA) module is introduced within the skip connections between the encoder and decoder. This module encodes positional information along both horizontal and vertical spatial directions and dynamically re-weights the feature maps, thereby effectively suppressing background noise caused by semantic gaps and enhancing the model’s ability to localize landslide regions. Additionally, an edge-guided attention (EGA) module is incorporated into the decoder. This module extracts explicit edge priors from the input image using a Laplacian operator and imposes geometric constraints on the predictions via a boundary reverse attention mechanism, thereby significantly alleviating boundary ambiguity and morphological distortion of landslides. Evaluations across datasets from the Yarlung Zangbo River, Iburi-Tobu, and Bijie regions demonstrate that CETransUNet significantly outperforms state-of-the-art models—including TransUNet, SegFormer, and SwinUNet—in terms of IoU, MIoU, and F1-score. 
Overall, through the synergistic optimization of the coordinate attention and edge-guided attention modules, the CETransUNet model achieves synchronous enhancement of boundary integrity and geometric precision in complex scenarios, providing a reliable technical solution for large-scale intelligent landslide identification.
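
The edge-guided attention module is described as extracting explicit edge priors with a Laplacian operator. A minimal sketch of that operator alone, using the common 4-neighbour kernel, is shown below. The kernel choice, the edge-replicating padding, and the function name are assumptions; the attention mechanism built on top of the prior is not reproduced here.

```python
import numpy as np

def laplacian_edges(gray):
    """Absolute response of the 4-neighbour Laplacian on a 2D image."""
    img = np.asarray(gray, dtype=float)
    # Replicate border pixels so flat regions give zero response at the edges.
    padded = np.pad(img, 1, mode="edge")
    lap = (padded[:-2, 1:-1] + padded[2:, 1:-1]
           + padded[1:-1, :-2] + padded[1:-1, 2:] - 4.0 * img)
    return np.abs(lap)

# A vertical step edge: the response is confined to the two boundary columns.
step = np.zeros((6, 6))
step[:, 3:] = 1.0
edges = laplacian_edges(step)
```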

(This article belongs to the Special Issue Advances in Geological Hazard Characterization and Assessment: Merging Remote Sensing with Direct Surveys)

22 pages, 43415 KB

Open Access Article

FSSM: Frequency-Enhanced State Space Modeling with FFT-Based Two-Sided Non-Causal Convolution for Image Dehazing

by Li Zeng and Yinqing Huang

J. Imaging 2026, 12(6), 260; https://doi.org/10.3390/jimaging12060260 - 13 Jun 2026

Viewed by 199

Abstract

Image dehazing is a fundamental visual restoration task for improving visual perception under low-visibility weather conditions, especially in UAV-based remote sensing, traffic monitoring, and surveillance scenarios. Existing convolutional neural networks are effective in local feature extraction but remain limited in long-range dependency modeling, while Transformer-based methods improve global modeling at the cost of high computational complexity. To address these issues, this paper proposes an efficient image-dehazing framework termed FSSM, which integrates frequency-enhanced State Space Modeling with a hierarchical encoder–decoder architecture. Specifically, an FFT-based State Space Block (FFTSSB) is designed to reformulate state propagation as frequency-domain two-sided non-causal convolution, enabling efficient bidirectional global dependency modeling without explicit recursive scanning. Furthermore, a Frequency-Aware Discriminative Enhancement Block (FDEB) is introduced to enhance local textures, edges, and structural details through spatial gating and lightweight block-wise frequency modulation. Based on these two components, a Frequency-Aware State Interaction (FASI) block is constructed to progressively couple global state propagation and local frequency-aware enhancement. Experimental results on the HazyDet dataset demonstrate that FSSM achieves favorable restoration accuracy, structural consistency, and perceptual quality compared with representative dehazing methods. Ablation studies further validate the effectiveness of the proposed two-sided FFT-based state modeling, frequency-aware enhancement, and hierarchical multi-scale design.
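
The phrase "frequency-domain two-sided non-causal convolution" can be made concrete with a one-dimensional sketch: a kernel centred on the current sample, so that each output depends on both past and future inputs, evaluated through the FFT. This illustrates the generic operation only and is not the FFTSSB block; the function name, signal, and kernel values are hypothetical.

```python
import numpy as np

def fft_noncausal_conv(signal, kernel):
    """Linear convolution with a centred (two-sided) odd-length kernel via FFT."""
    x = np.asarray(signal, dtype=float)
    k = np.asarray(kernel, dtype=float)
    if k.size % 2 == 0:
        raise ValueError("kernel length must be odd so it can be centred")
    radius = k.size // 2
    # Zero-pad to the full linear-convolution length to avoid wrap-around.
    n = x.size + k.size - 1
    full = np.fft.irfft(np.fft.rfft(x, n) * np.fft.rfft(k, n), n)
    # Drop the first `radius` samples so output[i] is centred on input[i].
    return full[radius:radius + x.size]

signal = np.arange(10, dtype=float)
kernel = np.array([0.1, 0.2, 0.4, 0.2, 0.1])  # taps for past and future samples
smoothed = fft_noncausal_conv(signal, kernel)
```

For long sequences the FFT route costs O(n log n) rather than the O(n * kernel length) of direct convolution, which is the usual efficiency argument for working in the frequency domain.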

(This article belongs to the Topic Computer Vision and Image Processing, 3rd Edition)

16 pages, 52629 KB

Open Access Article

Automatic Segmentation and Recognition of the Microstructure of High-Strength Low-Alloy Steel

by Lu Wang, Ziying Ren, Baoyu Song, Bing Wang, Qiaochuan Chen, Jingjing Wang, Tianpeng Zhou and Yuexing Han

Materials 2026, 19(12), 2554; https://doi.org/10.3390/ma19122554 - 12 Jun 2026

Viewed by 107

Abstract

Metallographic microstructure analysis is essential for understanding the evolution of steel microstructures during heat treatment and mechanical processing. However, accurate analysis of optical micrographs remains difficult because of blurred grain boundaries, grayscale inhomogeneity within grains, and irregular grain morphologies. To address these issues, this work proposes an automated metallographic image-processing method based on superpixels, DPSS (dual-phase steel segmentation), with the main contribution focused on microstructure segmentation. First, image contrast and boundary visibility are enhanced by edge detection and sharpening. Then, superpixel segmentation is combined with extracted edge information to improve boundary localization and preserve irregular grain morphology, enabling more complete extraction of grain or particle regions from optical images. The proposed method is validated on optical micrographs of Mn-Si low-alloy steel, and the results show that it provides more accurate and complete segmentation than conventional ImageJ-based processing (version 1.54f). Based on the segmented regions, a lightweight neural network is further used for phase identification. The final classification accuracy can reach 99.91%, which indicates that the improved segmentation provides more reliable inputs for subsequent microstructure recognition. Overall, the proposed method offers an effective and automated solution for metallographic image segmentation and supports more accurate downstream phase analysis.

(This article belongs to the Section Metals and Alloys)

25 pages, 14221 KB

Open Access Article

Phenology-Adaptive Prediction of Walnut Leaf Area Index from UAV Multispectral Data via Hybrid Feature Selection and SHAP-Enhanced Machine Learning

by Qiuhao Xia, Yerhazi Yerzati, Zihao Li, Jiahui Qi, Jiaxing Chen, Yu Sen, Rui Zhang, Yunqi Zhang, Hongxia Wang and Zhongzhong Guo

Remote Sens. 2026, 18(12), 1941; https://doi.org/10.3390/rs18121941 - 11 Jun 2026

Viewed by 121

Abstract

Accurate monitoring of the leaf area index (LAI) throughout the entire growth cycle of walnut trees using UAV multispectral imagery is essential for digital orchard management. In this study, focusing on the ‘Wen 185’ walnut variety in Xinjiang, we simultaneously acquired UAV multispectral images and ground-measured LAI data during four critical growth stages: expansion, hard shell, oil conversion, and maturity. A total of 25 vegetation indices and 48 texture features derived from the gray-level co-occurrence matrix were extracted. Hybrid feature selection combining linear (Pearson correlation), nonlinear (maximum information coefficient and random forest importance), and multiple consensus strategies was employed to reduce redundancy. LAI prediction models were constructed using four algorithms: Random Forest (RF), Support Vector Machine (SVM), LASSO, and Ridge Regression (RR), with model interpretability enhanced by SHAP analysis. Results showed that the multiple consensus screening reduced feature redundancy by an average of 69.6%. SHAP identified five core features: Redge_750_Mean, NDVI, B_Mean, RENDVI, and G_Homogeneity. Importantly, predictor importance shifted significantly with phenology: texture features dominated during the expansion stage, while red-edge indices (RENDVI and Redge_750_Mean) became predominant during the hard shell and oil conversion stages, effectively mitigating the saturation problem commonly observed in traditional indices such as NDVI within the LAI range of 1.5–5.8 in this study. The hybrid feature subset combining “red-edge spectrum + spatial texture” with the Random Forest algorithm achieved superior performance across all stages, with the RPD value exceeding 2.0 during the oil conversion stage, indicating excellent estimation capability. 
This study demonstrates that a “quality over quantity” feature selection strategy not only reduces model complexity but also enables high-precision, dynamic LAI monitoring throughout the entire walnut growth cycle, providing a scientific basis for intelligent management of large-scale orchards in arid regions.
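
Two quantities named in the abstract, normalized-difference indices such as NDVI and the RPD score, follow simple formulas that can be sketched as below. The RPD definition used here (sample standard deviation of the observations divided by RMSE) is the common convention and is assumed rather than confirmed by the abstract; the reflectance and LAI values are made up for illustration.

```python
import numpy as np

def normalized_difference(band_a, band_b):
    """(a - b) / (a + b), with zero output where the denominator is zero."""
    a = np.asarray(band_a, dtype=float)
    b = np.asarray(band_b, dtype=float)
    total = a + b
    return np.divide(a - b, total, out=np.zeros_like(total), where=total != 0)

def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations over RMSE."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return float(np.std(observed, ddof=1) / rmse)

# NDVI uses the near-infrared and red bands (illustrative reflectances).
nir = np.array([0.5, 0.6])
red = np.array([0.1, 0.2])
ndvi = normalized_difference(nir, red)

# Illustrative LAI observations and predictions with a constant 0.5 error.
lai_obs = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
lai_pred = lai_obs + np.array([0.5, -0.5, 0.5, -0.5, 0.5])
score = rpd(lai_obs, lai_pred)
```

A red-edge index such as RENDVI is typically obtained from the same function by substituting a red-edge band for the red band, although the exact band pairing used in the study is not given in the abstract.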

(This article belongs to the Special Issue Applications of Unmanned Aerial Remote Sensing in Precision Agriculture)

32 pages, 25468 KB

Open Access Article

MLE-ResUNet: SWIR Image Super-Resolution Using Along-Track Oversampling and Visible-Light-Guided Deep Learning

by Yongqian Zhu, Bo Cheng, Qianmin Liu, Zhijing He, Tianzhen Ma, Chen Cao, Bangjian Zhao, Miao Hu, Xianqiang He and Chunlai Li

Remote Sens. 2026, 18(12), 1922; https://doi.org/10.3390/rs18121922 - 10 Jun 2026

Viewed by 149

Abstract

Shortwave infrared (SWIR) imagery plays an important role in land–water boundary delineation, coastal monitoring, and complex aquatic environment observation. However, the spatial resolution of SWIR bands is usually lower than that of visible bands, which limits their capability to represent fine-scale targets and boundary structures. To address this problem, this study proposes MLE-ResUNet, a SWIR image super-resolution method that integrates along-track oversampling with visible-light-guided deep learning. The proposed method first exploits dual-view SWIR observations with sub-pixel displacement generated by increasing the sampling line rate in the push-broom imaging process. A maximum likelihood estimation (MLE)-based physical prior module is then introduced to transform multi-view degraded observations into a physically consistent latent high-resolution prior. Finally, high-resolution visible images are used to provide edge, texture, and structural guidance, and a ResUNet-based network is employed for multi-source feature fusion and residual reconstruction. Based on multi-region measured data acquired by the LHRSI (Lightweight High-Resolution Spectral Imager) payload onboard the BlueCarbon-1A satellite, a SWIR super-resolution dataset covering typical urban, farmland, and coastal scenarios was constructed. Comparative experiments were conducted against PCA, BDSD, PanNet, GPPNN, and two additional lightweight-guided deep learning baselines, namely LGPConv and a CANConv-style visible-guided baseline. The results show that MLE-ResUNet achieves the best performance across different scenarios and consistently outperforms the comparison methods in terms of SSIM, SAM, ERGAS, and Q-index. The proposed method effectively enhances spatial detail recovery while maintaining favorable spectral consistency. 
Ablation experiments further demonstrate that both along-track oversampling information and the MLE-based physical prior contribute to improved reconstruction quality and more stable training convergence. These findings indicate that the proposed method can enhance fine-scale SWIR observation capability without substantially increasing hardware complexity, providing an effective technical solution for shoreline identification, land–water boundary extraction, and complex surface target monitoring.
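
Of the evaluation metrics listed in the abstract, the spectral angle mapper (SAM) is the most compact to state: it measures the angle between the spectral vectors of corresponding pixels. The sketch below assumes images shaped (height, width, bands) and reports the mean angle in radians; the function name and the averaging convention are assumptions.

```python
import numpy as np

def spectral_angle(reference, estimate, eps=1e-12):
    """Mean spectral angle in radians between two (H, W, bands) images."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    dot = (ref * est).sum(axis=-1)
    norms = np.linalg.norm(ref, axis=-1) * np.linalg.norm(est, axis=-1)
    # eps guards against division by zero for all-zero pixels.
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)
    return float(np.arccos(cos).mean())

rng = np.random.default_rng(0)
reference = rng.uniform(0.1, 1.0, (4, 4, 3))
# A pure brightness change leaves the spectral angle at (almost) zero.
angle_scaled = spectral_angle(reference, 2.0 * reference)
```

Because SAM ignores per-pixel scaling, it is normally read together with intensity-sensitive metrics such as ERGAS or SSIM.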

(This article belongs to the Special Issue Advanced Object Detection, Classification and Recognition in VIR Optical and SAR Remote Sensing Imagery)

31 pages, 3749 KB

Open Access Article

Cascaded Dual Stage U-Net with Texture-Aware Feature Fusion for Unified Segmentation and Classification in Echo-Cardiogram Images

by Arakere Nagarajappa Jagadish, Ravikumar Manjunath and Indrakumar Krishnamurthy

Informatics 2026, 13(6), 84; https://doi.org/10.3390/informatics13060084 - 10 Jun 2026

Viewed by 274

Abstract

Accurate, automated analysis of medical images is indispensable for effective diagnosis and treatment planning, particularly for complex multiclass diseases. This paper presents a system that combines a cascaded dual-stage U-Net with texture-based deep learning techniques to improve segmentation and classification precision. The cascaded dual-stage U-Net architecture comprises two parallel encoding-decoding pathways optimized for deep semantic feature extraction. This dual-path design enables the network to recognize lesion edges and intricate structural variations across imaging modalities. To enhance diagnostic performance, texture features are extracted using the Color Co-occurrence Matrix (CCM), which preserves local texture patterns and color relationships, providing helpful context for deep feature extraction. We feed this enriched data into a convolutional neural network (CNN) classifier, which categorizes the images into disease groups. Extensive evaluation on benchmark medical image datasets (MRI, CT, endoscopic images) demonstrates the framework’s superior performance in segmentation accuracy, classification precision, and robustness to noise and distortions. Integrating segmentation and classification in a coherent pipeline increases the reliability and interpretability of the diagnostic process. This technique represents an important step toward the clinical utility of intelligent, automated medical image processing.

22 pages, 13414 KB

Open AccessArticle

Boundary-Aware Multi-Scale Feature Enhancement Based Few-Shot Hyperspectral Image Semantic Segmentation

by Xiaorong Zhang, Siyuan Li and Xi Zheng

Remote Sens. 2026, 18(12), 1911; https://doi.org/10.3390/rs18121911 - 9 Jun 2026

Viewed by 183

Abstract

To address the issues of model overfitting under scarce samples and poor segmentation performance on slender objects in the task of semantic segmentation of remote sensing hyperspectral images, this paper proposes a hyperspectral image semantic segmentation framework that integrates edge awareness and multi-scale feature enhancement under extremely few-shot conditions. This architecture effectively integrates orthogonal-direction convolutions, elongated feature enhancement, multi-scale feature fusion, and deep supervision mechanisms, solving challenges such as difficulty in extracting features of slender objects, model overfitting under few-sample conditions, and insufficient generalization ability. The experimental results on multiple public datasets show that the proposed algorithm achieves excellent segmentation performance with just one small-sized sample per labeled category, surpassing existing popular algorithms and thereby confirming the algorithm’s effectiveness and superiority. On the PaviaU dataset, the overall accuracy (OA) and mean intersection over union (mIoU) improved by approximately 9.7% and 15.5% compared to the second-best model; especially for the segmentation of the key elongated feature ‘road’, the intersection over union reached 94.75%, highlighting the effectiveness of the proposed mechanism. This paper provides a novel and efficient solution for fine interpretation of hyperspectral images under few-sample conditions.

(This article belongs to the Special Issue AI-Driven Hyperspectral Image Classification and Processing in Remote Sensing)

19 pages, 19256 KB

Open AccessArticle

YOLOv11-LicoSeg: A Method for Measuring the Radicle Length of Licorice

by Ruxiao Bai, Haixiu He, Zhibo Zhong, Limin Yu, Xiuqing Fu and Qifeng Wu

AgriEngineering 2026, 8(6), 234; https://doi.org/10.3390/agriengineering8060234 - 9 Jun 2026

Viewed by 198

Abstract

Global climate change and soil salinization pose challenges to licorice cultivation. Evaluating seed vigor based on the dynamic changes in radicle morphology is crucial for screening and cultivating licorice varieties that are tolerant to low temperatures and salts. Traditional manual measurement of licorice radicle characteristics suffers from issues such as high cost, long time consumption, and large errors. The YOLOv11 instance segmentation model in the field of deep learning offers advantages including a simple architecture, strong lightweight properties, and a unified detection-segmentation framework. Therefore, this study selected the YOLOv11 model to build a deep learning framework and used the continuous time-series crop growth vitality monitoring system to collect full-time-series images of 18 groups of licorice seeds germinating under different temperature and salt stress conditions. The YOLOv11-seg model was improved by adding a Spatial Strip Attention mechanism (SSA) to enhance the spatial correlation of radicle features, replacing ordinary convolutions with a Multi-scale Edge Detail Enhancement Module (MEEM) to optimize multi-scale feature extraction capabilities, and embedding a Normalized Weighted Distance (NWD) loss function to strengthen the segmentation ability for tiny targets. The YOLOv11-LicoSeg model was constructed for segmenting and extracting licorice radicle features and calculating root length. The experimental results showed that the mAP50 of the model’s detection reached 97.4%, mAP50–95 reached 81.7%, the mAP50 of the segmentation mask reached 97.0%, and mAP50–95 reached 78.2%. Compared with the unimproved YOLOv11-seg, the mAP50 of detection increased by 0.7%, mAP50–95 increased by 1.3%, the mAP50 of segmentation increased by 0.7%, and mAP50–95 increased by 0.8%. The linear regression coefficient between manual measurement and machine-vision measurement was 0.94218, and the goodness of fit R² was 0.94408. 
Using this model and the monitoring system, the morphological evolution of the licorice radicle contour characteristics over the germination time was obtained. The study indicated that the growth of licorice radicles was optimal under salt stress of 1200 µS/cm and 1800 µS/cm. YOLOv11-LicoSeg accurately segmented licorice radicles and calculated radicle length, segmenting 100 licorice radicle images within 7 s. After deployment, it significantly reduced the labor cost and time consumption for acquiring licorice radicle phenotypes. In conclusion, YOLOv11-LicoSeg provides a rapid and accurate method for variety screening in licorice breeding and cultivation.

(This article belongs to the Special Issue Sensors and Computer Vision for Quality Assessment of Agricultural Products)

Search Results (1,753)
