Search Results (1,031)

Search Parameters:
Keywords = semantic segmentation neural network

16 pages, 4919 KB  
Article
EA-UNET: An Enhanced and Efficient Model for Left-Turn Lane
by Haowei Wang, Haixin Liu, Fei Wang, Xingbin Chen, Baogang Li and Jiang Liu
Sensors 2026, 26(9), 2642; https://doi.org/10.3390/s26092642 - 24 Apr 2026
Abstract
Left-turn lanes are critical elements of urban intersections. Accurate and efficient lane detection is essential for the safe navigation of autonomous vehicles. To address the limitations of existing semantic segmentation algorithms—specifically, inadequate detection accuracy, high computational cost, and vulnerability to environmental disturbances—we propose a lightweight deep convolutional neural network named EA-UNet. First, we replace the standard U-Net encoder with EfficientNet-B0 to enhance feature extraction efficiency. Second, we introduce a novel contextual coordination module, termed MP-ASPP, which integrates a Convolutional Block Attention Module (CBAM) to further refine attention mechanisms. Finally, a comprehensive real-world dataset was constructed by collecting videos and images of left-turn waiting areas during real-vehicle testing. Experimental results demonstrate that EA-UNet significantly outperforms the baseline U-Net and other state-of-the-art models, achieving accurate and efficient segmentation of left-turn lanes even in complex scenes. Full article
(This article belongs to the Section Vehicular Sensing)
28 pages, 7089 KB  
Article
Multi-Scale Context-Aware Network Implementation for Efficient Image Semantic Segmentation
by Yi Yang and Chong Guo
Appl. Sci. 2026, 16(8), 4033; https://doi.org/10.3390/app16084033 - 21 Apr 2026
Abstract
Image semantic segmentation is essential in autonomous driving, medical imaging, and remote sensing. While convolutional neural networks (CNNs) excel at local feature extraction and spatial structure modeling, their limited receptive fields restrict the capture of long-range dependencies and global semantic consistency. Transformers provide strong global modeling through self-attention but often lack local inductive bias and show weaker generalization on small datasets. To address these limitations, this paper proposes a Multi-Scale Context-aware Network (MSC-Net) for image semantic segmentation. Under an encoder–decoder framework, MSC-Net combines a convolutional backbone with a Multi-Scale Self-Attention module to integrate the complementary strengths of CNNs and attention mechanisms. The backbone extracts local texture and structural information and can adopt architectures such as MobileNet, Xception, DRN, and ResNet, while the attention module captures long-range dependencies and multi-scale contextual information. This design improves cross-layer feature collaboration, multi-scale feature fusion, and boundary quality while maintaining computational efficiency. Experimental results show that MSC-Net achieves 38.8% mIoU and 98.4% ACC under comparable computational settings. Compared with SegFormer and DeepLabV3+, the model improves mIoU by approximately +3.0 and +3.3 percentage points, respectively, while reducing FLOPs and parameter size. Full article
15 pages, 662 KB  
Article
A Hybrid Multi-Domain Feature Fusion Model Integrating MEEMD and Dual CNN for Iris Recognition
by Zine Eddine Louriga, Ismail Jabri, Aziza El Ouaazizi and Anass El Affar
Mach. Learn. Knowl. Extr. 2026, 8(4), 111; https://doi.org/10.3390/make8040111 - 21 Apr 2026
Abstract
Iris biometric systems are recognized as secure alternatives to conventional authentication methods, yet challenges such as variable illumination, noise, and intricate iris textures persist. To address these issues, our study presents a novel hybrid iris recognition framework that integrates advanced deep learning with a pioneering application of Multivariate Ensemble Empirical Mode Decomposition (MEEMD) for feature extraction—a method not previously applied in this context. Our framework first employs MEEMD to extract statistical features that capture the iris’s nonlinear and nonstationary variations. We then combine global semantic information from two pretrained convolutional neural networks—VGG16 and ResNet-152—with local micro-texture details encoded by Local Binary Patterns (LBP) to form a comprehensive feature representation. An efficient pre-processing and segmentation stage precisely isolates the iris region, and the resulting features are refined through dimensionality reduction techniques to yield a robust, compact representation. These features are subsequently classified using multiple models, each rigorously tuned via hyperparameter optimization. Experimental validation on benchmark datasets—including IITD, CASIA, and UBIRIS.v2—shows that our model achieves recognition rates of up to 98% on IITD, 97% on CASIA, and 97.30% on UBIRIS.v2, surpassing existing approaches. This work not only enhances iris recognition performance but also establishes a novel method that bridges advanced deep learning with innovative feature extraction for high-security applications. Full article
(This article belongs to the Section Learning)
36 pages, 23824 KB  
Article
Differential Morphological Profile Neural Networks for Semantic Segmentation
by David Huangal and J. Alex Hurt
Remote Sens. 2026, 18(8), 1188; https://doi.org/10.3390/rs18081188 - 15 Apr 2026
Abstract
Semantic segmentation of overhead remote sensing imagery supports critical applications in mapping, urban planning, and disaster response, yet state-of-the-art segmentation networks are predominantly designed for ground-perspective imagery and do not directly address remote sensing challenges such as extreme scale variation, foreground–background imbalance, and large image sizes. Rather than proposing new architectures, we take an architecture-agnostic approach by incorporating the differential morphological profile (DMP), a multi-scale shape extraction method based on grayscale morphology, as supplementary input to modern segmentation networks. We evaluate two integration strategies: a Direct-In approach, which adapts the input stem to accept DMP channels in place of or alongside RGB data, and a Hybrid DMP dual-stream architecture in which separate RGB and DMP encoders process each modality independently. Experiments on the iSAID, ISPRS Potsdam, and LoveDA benchmark datasets assess multiple DMP differentials and structuring element shapes. Results show that using the DMP as direct input to models generally under-performs RGB-only baselines, while the Hybrid DMP approach substantially closes this gap and in some cases surpasses baseline performance, with gains varying across object categories. In the strongest case, a Hybrid DMP SegNeXt-S model achieves a gain of +3.19 mIoU over the RGB-only baseline on the ISPRS Potsdam dataset, and Hybrid DMP models outperform the RGB-only baseline on two of the three benchmark datasets evaluated. These findings suggest that DMP features provide complementary shape information that, when properly integrated, can enhance semantic segmentation performance for overhead remote sensing imagery. Full article
(This article belongs to the Section AI Remote Sensing)
23 pages, 20258 KB  
Article
Mining Scene Classification and Semantic Segmentation Using 3D Convolutional Neural Networks
by André Estevam Costa Oliveira, Matheus Corrêa Domingos, Valdivino Alexandre de Santiago Júnior and Maria Isabel Sobral Escada
Remote Sens. 2026, 18(8), 1112; https://doi.org/10.3390/rs18081112 - 8 Apr 2026
Abstract
High spatio-temporal resolution satellite imagery has become increasingly accessible thanks to advancements in the aerospace industry, which, combined with growing computational power, have enabled the emergence of novel recognition techniques for remote sensing (RS) images. However, there is still a lack of studies on 3D convolutions for spatio-temporal data applied to classification problems in RS. Hence, this study investigates the feasibility of 3D convolutional neural networks (3DCNNs) within a spatio-temporal perspective for scene classification and semantic segmentation in RS images, focusing on the identification of mining sites. We first developed a dataset covering several parts of Brazil based on MapBiomas products and Planet imagery, then evaluated the effectiveness of 3DCNNs in capturing temporal information from a sequence of monthly captured images. We also compared 3D and 2D approaches for both scene classification and semantic segmentation. For scene classification, a 3DCNN was better than the corresponding 2D model, while a 2D U-Net was better than a U-Net3D for semantic segmentation. The main explanation is that a less costly annotation and training strategy was adopted, which may have harmed the spatio-temporal approaches for semantic segmentation but not for scene classification. However, U-Net3D presented the highest Precision of all models, meaning it is highly accurate when it predicts a positive. Moreover, the 3DCNN (U-Net3D) performed significantly better on semantic segmentation than other spatio-temporal approaches such as ConvLSTM+U-Net and TempCNN. Sensitivity analysis revealed that the near-infrared (NIR) band played a decisive role in distinguishing mining areas, emphasizing its importance in highlighting subtle spectral variations associated with land-cover disturbances. Full article
(This article belongs to the Section Environmental Remote Sensing)
9 pages, 1667 KB  
Proceeding Paper
Cost-Effective Device with Semantic Segmentation Capability for Real-Time Detection and Classification of Marine Litter in Benthic Coastal Areas
by John Paul T. Cruz, Josiah Izaak D. Lopez, Marlon V. Maddara, Karl Justin B. Nacito, Marites B. Tabanao, Vladimer B. Kobayashi and Roben A. Juanatas
Eng. Proc. 2026, 134(1), 34; https://doi.org/10.3390/engproc2026134034 - 7 Apr 2026
Abstract
Anthropogenic marine debris (AMD) in shallow coastal benthic areas poses serious threats to ecosystems, human health, and the economy. Addressing this issue is hindered by limited data on AMD distribution and classification. We explored the use of semantic segmentation, specifically Pyramid Scene Parsing Network (PSPNet) and Deep Convolutional Neural Network for Semantic Image Segmentation, Version 3, (DeepLabV3) models, for automated AMD detection and classification. The performance was evaluated using mean intersection over union (mIoU), pixel accuracy, and frames per second (FPS). PSPNet achieved a higher mIoU (77.03%) than DeepLabV3 (75.98%), indicating better object identification. However, DeepLabV3 outperformed PSPNet in pixel accuracy (92.24% vs. 92.01%) and FPS (8.83 vs. 6.92), making it more appropriate for real-time applications. To enable real-time identification and classification of AMD, the models are deployed in a minicomputer with adequate processing power, significantly enhancing the models’ frame rate during real-time image processing. While both models are effective, DeepLabV3 is recommended for real-time AMD segmentation. The study contributes to improving AMD monitoring and management in coastal environments through AI-driven solutions. Full article
15 pages, 3194 KB  
Article
Detection of Microplastics in Coastal Environments Based on Semantic Segmentation
by Javier Lorenzo-Navarro, José Salas-Cáceres, Modesto Castrillón-Santana, May Gómez and Alicia Herrera
Microplastics 2026, 5(2), 66; https://doi.org/10.3390/microplastics5020066 - 3 Apr 2026
Abstract
Microplastics represent an emerging threat to aquatic ecosystems, human health, and coastal aesthetics, with increasing concern about their accumulation on beaches due to ocean currents, wave action, and accidental spills. Despite their environmental impact, current methods for detecting and quantifying microplastics remain largely manual, time-consuming, and spatially limited. In this study, we propose a deep learning-based approach for the semantic segmentation of microplastics on sandy beaches, enabling pixel-level localization of small particles under real-world conditions. Twelve segmentation models were evaluated, including U-Net and its variants (Attention U-Net, ResUNet), as well as state-of-the-art architectures such as LinkNet, PAN, PSPNet, and YOLOv11 with segmentation heads. Models were trained and tested on augmented data patches, and their performance was assessed using Intersection over Union (IoU) and Dice coefficient metrics. LinkNet achieved the best performance with a Dice coefficient of 80% and an IoU of 72.6% on the test set, showing superior capability in segmenting microplastics even in the presence of visual clutter such as debris or sand variation. Qualitative results support the quantitative findings, highlighting the robustness of the model in complex scenes. Full article
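Several entries in this results list report segmentation quality as Intersection over Union (IoU) and the Dice coefficient, as in the LinkNet figures above. For reference only (this is not the evaluation code of any listed paper), a minimal NumPy sketch of both metrics for binary masks:

```python
import numpy as np

def iou_and_dice(pred, target):
    """IoU and Dice for two binary masks of equal shape.

    pred, target: arrays of {0, 1} or bool, e.g. model output vs. ground truth.
    Returns (iou, dice); both are defined as 1.0 when both masks are empty.
    """
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()   # |A ∩ B|
    union = np.logical_or(pred, target).sum()    # |A ∪ B|
    total = pred.sum() + target.sum()            # |A| + |B|
    iou = inter / union if union else 1.0
    dice = 2 * inter / total if total else 1.0
    return iou, dice
```

Note that Dice = 2·IoU / (1 + IoU), so the Dice score is always at least as high as the IoU for the same masks, which is why papers typically report Dice values a few points above their IoU values.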
19 pages, 6364 KB  
Article
Integrating Unmanned Aerial Vehicle Imagery and Convolutional Neural Networks for Mapping and Classifying Soil Disturbance in Steep Forest Terrain
by Jaewon Seo, Ikhyun Kim and Byoungkoo Choi
Forests 2026, 17(4), 447; https://doi.org/10.3390/f17040447 - 2 Apr 2026
Abstract
Mechanized timber harvesting on steep slopes causes soil disturbance; however, comprehensive post-harvest assessment remains challenging because terrain complexity and safety constraints render traditional field-based methods labor-intensive, spatially limited, and difficult to implement systematically. In this study, we developed and evaluated a convolutional neural network-based semantic segmentation model for detecting soil disturbances using high-resolution unmanned aerial vehicle (UAV) imagery in a steep-slope harvested area (2.50 ha, mean slope of 53.4%) in the Republic of Korea. A U-Net semantic segmentation model was trained on manually annotated orthomosaic tiles incorporating RGB and digital elevation model (DEM) inputs. Ensemble predictions at an optimized threshold of 0.65 achieved an Intersection over Union (IoU) of 0.55 and an F1-score of 0.71. Although moderate, these values reflect the inherently challenging conditions of steep-slope forest terrain compared to similar studies conducted on gentler terrain. DEM-derived depth estimation enabled severity classification of the detected disturbances, with light disturbances predominating. Field validation using 38 pinboard measurements demonstrated reliable spatial detection (ρ = 0.567, RMSE = 6.45 cm). This approach provides an effective alternative to traditional monitoring practices in mountainous forests, where systematic trail planning is impractical, and may support evidence-based assessment of harvesting impacts for sustainable forest management. Full article
(This article belongs to the Special Issue The Influence of Mechanized Timber Harvesting on Soils and Stands)
16 pages, 1529 KB  
Article
Image Segmentation-Guided Visual Tracking on a Bio-Inspired Quadruped Robot
by Hewen Xiao, Guangfu Ma and Weiren Wu
Biomimetics 2026, 11(4), 234; https://doi.org/10.3390/biomimetics11040234 - 2 Apr 2026
Abstract
Bio-inspired quadrupedal robots exhibit superior adaptability and mobility in unstructured environments, making them suitable for complex task scenarios such as navigation, obstacle avoidance, and tracking in a variety of environments. Visual perception plays a critical role in enabling autonomous behavior, offering a cost-effective alternative to multi-sensor systems. This paper proposes an image segmentation-guided visual tracking framework to enhance both perception and motion control in quadruped robots. On the perception side, a cascaded convolutional neural network is introduced, integrating a global information guidance module to fuse low-level textures and high-level semantic features. This architecture effectively addresses limitations in single-scale feature extraction and improves segmentation accuracy under visually degraded conditions. On the control side, segmentation outputs are embedded into a biologically inspired central pattern generator (CPG), enabling coordinated generation of limb and spinal trajectories. This integration facilitates a closed-loop visual-motor system that adapts dynamically to environmental changes. Experimental evaluations on benchmark image segmentation datasets and robotic locomotion tasks demonstrate that the proposed framework achieves enhanced segmentation precision and motion flexibility, outperforming existing methods. The results highlight the effectiveness of vision-guided control strategies and their potential for deployment in real-time robotic navigation. Full article
26 pages, 3329 KB  
Article
Multi-Class Weed Quantification Based on U-Net Convolutional Neural Networks Using UAV Imagery
by Lucía Sandoval-Pillajo, Marco Pusdá-Chulde, Jorge Pazos-Morillo, Pedro Granda-Gudiño and Iván García-Santillán
Appl. Sci. 2026, 16(7), 3149; https://doi.org/10.3390/app16073149 - 25 Mar 2026
Abstract
Weed identification and quantification are processes that are usually manual, subjective, and error-prone. Weeds compete with crops for nutrients, minerals, physical space, sunlight, and water. Thus, weed identification is a crucial component of precision agriculture for autonomous removal and site-specific treatments, efficient weed control, and sustainability. Convolutional Neural Networks (CNNs) are very common in weed identification. This work implemented CNN models for semantic segmentation based on the U-Net architecture for automatically segmenting and quantifying weeds in potato crops using RGB images acquired by a drone at 9–10 m height, flying at 1 m/s. Remote sensing images are affected by factors that degrade image quality and the model’s accuracy. Five U-Net variants were evaluated: the original U-Net, Residual U-Net, Double U-Net, Modified U-Net, and AU-Net. The models were trained using the TensorFlow/Keras frameworks on Google Colab Pro+, following the Knowledge Discovery in Databases (KDD) methodology for image analysis. Each model was trained using a diverse custom dataset in uncontrolled environments, considering six classes: background, Broadleaf dock (Rumex obtusifolius), Dandelion (Taraxacum officinale), Kikuyu grass (Cenchrus clandestinum), other weed species, and the crop potato (Solanum tuberosum L.). The models’ segmentation was widely assessed using Mean Dice Coefficient, Mean IoU, and Dice Loss metrics. The results showed that the Residual U-Net model performed the best in multi-class segmentation, achieving a Mean IoU of 0.8021, a performance comparable to or superior to that reported by other authors. Additionally, a Student’s t-test was applied to complement the data analysis, suggesting that the model is reliable for weed quantification. Full article
(This article belongs to the Collection Agriculture 4.0: From Precision Agriculture to Smart Agriculture)
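Multi-class figures such as the Mean IoU of 0.8021 above are conventionally computed per class from a confusion matrix and then averaged over classes. A minimal NumPy sketch of that convention (illustrative only; the class count of 6 matches the entry above, but this is not the authors' evaluation code):

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes for integer label maps of equal shape.

    Builds a num_classes x num_classes confusion matrix (rows = ground
    truth, columns = prediction), takes per-class IoU from its diagonal,
    and averages, ignoring classes absent from both maps.
    """
    idx = num_classes * target.ravel() + pred.ravel()
    cm = np.bincount(idx, minlength=num_classes ** 2)
    cm = cm.reshape(num_classes, num_classes)
    inter = np.diag(cm).astype(float)                 # true positives per class
    union = cm.sum(0) + cm.sum(1) - inter             # TP + FP + FN per class
    ious = np.where(union > 0, inter / np.maximum(union, 1), np.nan)
    return np.nanmean(ious)
```

Averaging with `np.nanmean` skips classes that never occur, so a rare class missing from a validation tile does not drag the mean to zero; some toolkits instead score absent classes as 1.0, which is one reason reported mIoU values are not always directly comparable across papers.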
23 pages, 7102 KB  
Article
Detection of Uniform Corrosion in Steel Pipes Using a Mobile Artificial Vision System
by Rafael Antonio Rodríguez Ospino, Cristhian Manuel Durán Acevedo and Jeniffer Katerine Carrillo Gómez
Corros. Mater. Degrad. 2026, 7(1), 21; https://doi.org/10.3390/cmd7010021 - 20 Mar 2026
Abstract
Corrosion in steel pipelines can cause critical failures in industrial systems, while conventional inspection methods such as radiography and ultrasonic testing are costly and require specialized personnel. This study presents a mobile computer vision system for automated corrosion detection inside steel pipes using deep learning-based visual analysis. The proposed system consists of a Raspberry Pi 4-based mobile robot equipped with a high-resolution camera for internal inspection. Acquired images were processed using color-space transformations (RGB–HSV), filtering, and segmentation. Convolutional neural networks and semantic segmentation models, including YOLOv8-seg (Instance segmentation) and DeepLabV3 (Semantic segmentation), were trained on a custom corrosion image dataset to identify corroded regions. Real-time visualization was implemented via Flask-based video streaming. Experimental results demonstrated high detection accuracy for uniform corrosion, achieving a mean Intersection over Union (mIoU) above 0.98 and a precision of 0.99 with the YOLOv8-seg model. These results indicate that the proposed system enables reliable and automated corrosion inspection, with the potential to reduce inspection costs and improve operational efficiency. Future work will focus on enhancing real-time performance through hardware optimization. Full article
26 pages, 4321 KB  
Article
Automation of Ultrasonic Monitoring for Resistance Spot Welding Using Deep Learning
by Ryan Scott, Danilo Stocco, Sheida Sarafan, Lukas Behnen, Andriy M. Chertov, Priti Wanjara and Roman Gr. Maev
J. Manuf. Mater. Process. 2026, 10(3), 101; https://doi.org/10.3390/jmmp10030101 - 17 Mar 2026
Abstract
Reliable process monitoring and quality evaluation for resistance spot welding (RSW) have become more important now than ever. An ultrasonic probe embedded into welding electrodes has enabled the acquisition of data about molten pool formation throughout welding, but automation of high-performance ultrasonic data analyses is still necessary to fully realize a monitoring system. This work proposes a two-stage deep learning (DL) approach for automated ultrasonic data analysis for RSW process monitoring. The first stage conducts semantic segmentation on ultrasonic M-scan welding process signatures, yielding masks for identified molten pool and stack regions from which weld penetration measurements can be directly extracted, as well as expulsion occurrences throughout welding. From input images and segmentation outputs, the second stage directly estimates resultant weld nugget diameters using an additional neural network. Both stages leveraged architectures based on TransUNet, mixing elements of both convolutional neural networks (CNN) and vision transformers, and the effect of cross-attention for stack-up sheet thickness data fusion was investigated via an ablation study. Additionally, in the diameter estimation stage, the ablation study included alternative feature extraction architectures in the network and investigated the provision of M-scans to the model alongside segmentation masks. In both cases, cross-attention was determined to improve performance, and in the case of diameter estimation, providing M-scans as input was found to be beneficial in general. With cross-attention, the segmentation approach yielded a mean intersection over union (IoU) of 0.942 on molten pool, stack, and expulsion regions in the M-scans with 13.4 ms inference time. With cross-attention, diameter estimates yielded a mean absolute error of 0.432 mm with 4.3 ms inference time, representing a significant improvement over algorithmic approaches based on ultrasonic time of flight.
Additionally, the approach attained >90% probability of detection (POD) at 0.830 mm below the acceptable diameter threshold and <10% probability of false alarm (PFA) at 0.828 mm above the threshold. These results demonstrate a novel production-ready application of DL in ultrasonic nondestructive evaluation (NDE) and pave the way for zero-defect RSW manufacturing. Full article
(This article belongs to the Special Issue Recent Advances in Welding and Joining Metallic Materials)
30 pages, 2135 KB  
Article
SBM–Attention U-Net: A Hybrid Transformer Network for Liver Tumor Segmentation in Medical Images
by Yiru Chen, Xuefeng Li, Yang Du, Hui Jiang, Xiaohui Liu, Nan Ma and Xuemei Wang
Sensors 2026, 26(6), 1851; https://doi.org/10.3390/s26061851 - 15 Mar 2026
Abstract
This study proposes a novel liver and liver tumor segmentation model. The architecture integrates BiFormer into the bottom two layers of the Attention U-Net encoder to enhance global semantic context modeling and establish long-range pixel-wise dependencies. The proposed spatial-channel dual attention (SCDA) mechanism is incorporated into the first three encoder layers to refine the fine-grained feature processing capabilities, particularly for precise delineation of liver and tumor boundaries. Eventually, a Mix Structure Block (MSB) is implemented within the decoder to optimize fusion of deep semantic and shallow spatial features, thereby elevating segmentation accuracy. Ablation experiments were conducted on three publicly available datasets. On the 3Dircadb dataset, the mean dice coefficient achieved was 0.9377 and the mean IoU Index achieved was 0.8889. On the LITS dataset, the mean dice coefficient achieved was 0.9257 and the mean IoU Index achieved was 0.8704. On the CHAOS dataset, the mean dice coefficient achieved was 0.9611 and the mean IoU Index achieved was 0.9259. These results validate the functionality and effectiveness of the proposed network model. This study constructed a novel neural network based on attention mechanisms; by enabling precise and automated segmentation directly from raw sensor-acquired medical images, the proposed method enhances the diagnostic value of these imaging sensors, facilitating more accurate clinical decision-making. Full article
24 pages, 4915 KB  
Article
Semantic-Guided Matching of Heterogeneous UAV Imagery and Mobile LiDAR Data Using Deep Learning and Graph Neural Networks
by Tee-Ann Teo, Hao Yu and Pei-Cheng Chen
Drones 2026, 10(3), 185; https://doi.org/10.3390/drones10030185 - 8 Mar 2026
Abstract
The integration of heterogeneous geospatial data, specifically low-cost unmanned aerial vehicle (UAV) imagery and mobile light detection and ranging (LiDAR) system point clouds, presents a significant challenge due to the significant radiometric and structural discrepancies between the two modalities. This study proposes a novel air-to-ground semantic feature matching framework to achieve precise geometric registration between these data sources by effectively incorporating semantic-constraint deep learning-based matching. The methodology transformed the cross-sensor alignment challenge into a robust two-dimensional image matching problem. This was achieved by first using YOLOv11 for semantic segmentation of common road markings in both the UAV orthoimage and the converted LiDAR intensity image to generate highly consistent feature references. Subsequently, the SuperPoint detector and a graph neural network matcher, SuperGlue, were applied to these semantic images to establish reliable geomatics information correspondence points. Experimental results confirmed that this semantic-guided strategy consistently outperformed traditional feature-based matching (i.e., scale-invariant feature transform + fast library for approximate nearest neighbors), particularly by converting the noisy LiDAR intensity image into a stabilized semantic representation. The explicit application of semantic constraints further proved effective in eliminating false matches between geometrically similar but semantically distinct objects. The final object-specific analysis demonstrated that features with clear, complex geometric structures (e.g., pedestrian crossings and directional arrows) provide the most robust matching control. In summary, the proposed framework successfully leverages semantic context to overcome cross-sensor heterogeneity, offering an automated and precise solution for the geometric alignment of mobile LiDAR data. Full article
20 pages, 1972 KB  
Article
Segmentation Is Not the Purpose: A Wheat Impurity Regression Network Integrating Semantic Segmentation
by Yuhang Bian, Haoze Yu, Xiangdong Li, Xiao Zhang and Dong Li
Agriculture 2026, 16(5), 578; https://doi.org/10.3390/agriculture16050578 - 3 Mar 2026
Abstract
Real-time and accurate acquisition of the wheat impurity rate is a key technology for realizing intelligent cleaning operations, and it directly influences the quality of wheat harvest. This study proposes a novel impurity rate regression network named Segmentation is Not The Purpose (SNTP). SNTP integrates a semantic segmentation network and an impurity rate regression network into a single neural architecture and replaces the DeepLabV3+ backbone with MobileNetV4, which serves as the segmentation branch of SNTP. Furthermore, a Transformer block is introduced into the regression branch to enable global feature extraction, and a Generalized Categorical Regression head is designed based on Distribution Focal Loss to improve regression accuracy. The SNTP model ultimately achieves an MIoU of 77.7%, an MPA of 83.3%, an MAE of 0.045, and an MSE of 0.005 on the validation set, with only 9.51M parameters and 17.98 GMACs of computation, successfully solving the overfitting problem in impurity rate regression networks and achieving high regression accuracy. SNTP is easy to optimize, requires no additional prior knowledge, and the performance of the SNTP model is unaffected by camera mounting height, making it exceptionally versatile for deployment and enabling real-time impurity rate detection, which is the key technology for intelligent cleaning. Full article
(This article belongs to the Section Artificial Intelligence and Digital Agriculture)