Search Results (315)

Search Parameters:
Keywords = boundary semantic information

22 pages, 2919 KiB  
Article
A Feasible Domain Segmentation Algorithm for Unmanned Vessels Based on Coordinate-Aware Multi-Scale Features
by Zhengxun Zhou, Weixian Li, Yuhan Wang, Haozheng Liu and Ning Wu
J. Mar. Sci. Eng. 2025, 13(8), 1387; https://doi.org/10.3390/jmse13081387 - 22 Jul 2025
Abstract
The accurate extraction of navigational regions from images of navigational waters plays a key role in ensuring on-water safety and the automation of unmanned vessels. Nonetheless, current methods encounter significant challenges in handling fluctuations in water surface illumination, reflective disturbances, and surface undulations, among other disruptions, which make rapid and precise boundary segmentation difficult to achieve. To cope with these challenges, in this paper we propose a coordinate-aware multi-scale feature network (GASF-ResNet) for water segmentation. The method integrates the Global Grouping Coordinate Attention (GGCA) module into the four downsampling branches of ResNet-50, enhancing the model's ability to capture target features and improving feature representation. To expand the model's receptive field and boost its capability to extract features of multi-scale targets, Atrous Spatial Pyramid Pooling (ASPP) is used. Combined with multi-scale feature fusion, this effectively enhances the expression of semantic information at different scales and improves the segmentation accuracy of the model in complex water environments. The experimental results show that the mean pixel accuracy (mPA) and mean intersection over union (mIoU) of the proposed method are 99.31% and 98.61% on the self-made dataset and 98.55% and 99.27% on the USVInland unmanned vessel dataset, respectively, significantly better than the results obtained by existing mainstream models. These results help overcome the background interference caused by water surface reflection and uneven lighting and enable accurate segmentation of the water area for the safe navigation of unmanned vessels, which is of great value for their stable operation in complex environments. Full article
(This article belongs to the Section Ocean Engineering)
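The multi-scale receptive-field idea behind ASPP (atrous/dilated convolutions applied in parallel at several rates, then stacked) can be sketched in NumPy. This is an illustrative toy under assumed 1-D inputs, an invented kernel, and invented rates, not the paper's implementation:

```python
import numpy as np

def dilated_conv1d(x, w, rate):
    """'Atrous' 1-D convolution: the taps of w are spaced `rate` apart
    (zero-padded so the output keeps the input length)."""
    k = len(w)
    pad = (k - 1) * rate // 2
    xp = np.pad(x, pad)
    return np.array([sum(w[j] * xp[i + j * rate] for j in range(k))
                     for i in range(len(x))])

def aspp_1d(x, w, rates=(1, 2, 4)):
    """Parallel dilated branches over the same input, stacked channel-wise:
    each branch sees a wider context without extra parameters."""
    return np.stack([dilated_conv1d(x, w, r) for r in rates])
```

Widening the rate enlarges the receptive field while the kernel size stays fixed, which is the property the abstract leans on for multi-scale targets.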

18 pages, 1332 KiB  
Article
SC-LKM: A Semantic Chunking and Large Language Model-Based Cybersecurity Knowledge Graph Construction Method
by Pu Wang, Yangsen Zhang, Zicheng Zhou and Yuqi Wang
Electronics 2025, 14(14), 2878; https://doi.org/10.3390/electronics14142878 - 18 Jul 2025
Abstract
In cybersecurity, constructing an accurate knowledge graph is vital for discovering key entities and relationships in security incidents buried in vast unstructured threat reports. Traditional knowledge-graph construction pipelines based on handcrafted rules or conventional machine learning models falter when the data scale and linguistic variety grow. GraphRAG, a retrieval-augmented generation (RAG) framework that splits documents into fixed-length chunks and then retrieves the most relevant ones for generation, offers a scalable alternative yet still suffers from fragmentation and semantic gaps that erode graph integrity. To resolve these issues, this paper proposes SC-LKM, a cybersecurity knowledge-graph construction method that couples the GraphRAG backbone with hierarchical semantic chunking. SC-LKM applies semantic chunking to build a cybersecurity knowledge graph that avoids the fragmentation and inconsistency seen in prior work. The semantic chunking method first respects the native document hierarchy and then refines boundaries with topic similarity and named-entity continuity, maintaining logical coherence while limiting information loss during the fine-grained processing of unstructured text. SC-LKM further integrates the semantic comprehension capacity of Qwen2.5-14B-Instruct, markedly boosting extraction accuracy and reasoning quality. Experimental results show that SC-LKM surpasses baseline systems in entity-recognition coverage, topology density, and semantic consistency. Full article
(This article belongs to the Section Artificial Intelligence)
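The hierarchical semantic-chunking step can be approximated as a greedy boundary refinement over sentence embeddings: start a new chunk whenever the next sentence drifts too far from the current chunk's topic centroid. A minimal sketch, assuming embeddings are precomputed elsewhere; the fixed threshold and plain cosine test are simplifications of the paper's topic-similarity and named-entity-continuity criteria:

```python
import numpy as np

def cos(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def semantic_chunks(sentences, embeddings, threshold=0.5):
    """Greedy chunking: open a new chunk when the next sentence's embedding is
    insufficiently similar to the running centroid of the current chunk."""
    chunks, current, vecs = [], [sentences[0]], [embeddings[0]]
    for s, e in zip(sentences[1:], embeddings[1:]):
        centroid = np.mean(vecs, axis=0)
        if cos(centroid, e) < threshold:   # topic break: close the chunk
            chunks.append(current)
            current, vecs = [s], [e]
        else:                              # same topic: extend the chunk
            current.append(s)
            vecs.append(e)
    chunks.append(current)
    return chunks
```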

21 pages, 5616 KiB  
Article
Symmetry-Guided Dual-Branch Network with Adaptive Feature Fusion and Edge-Aware Attention for Image Tampering Localization
by Zhenxiang He, Le Li and Hanbin Wang
Symmetry 2025, 17(7), 1150; https://doi.org/10.3390/sym17071150 - 18 Jul 2025
Abstract
When faced with diverse types of image tampering and image quality degradation in real-world scenarios, traditional image tampering localization methods often struggle to balance boundary accuracy and robustness. To address these issues, this paper proposes a symmetry-guided dual-branch image tampering localization network, FENet (Fusion-Enhanced Network), that integrates adaptive feature fusion and edge attention mechanisms. The method is based on a structurally symmetric dual-branch architecture, which extracts RGB semantic features and SRM noise residual information to comprehensively capture the fine-grained differences in tampered regions at the visual and statistical levels. To effectively fuse these features, this paper designs a self-calibrating fusion module (SCF), which introduces a content-aware dynamic weighting mechanism to adaptively adjust the importance of the different feature branches, thereby enhancing the discriminative power and expressiveness of the fused features. Furthermore, considering that image tampering often involves abnormal changes in edge structures, we propose an edge-aware coordinate attention mechanism (ECAM). By jointly modeling spatial position information and edge-guided information, the model is guided to focus more precisely on potential tampering boundaries, enhancing its boundary detection and localization capabilities. Experiments on the public Columbia, CASIA, and NIST16 datasets demonstrate that FENet achieves significantly better results than existing methods. We also analyze the model's performance under various image quality conditions, such as JPEG compression and Gaussian blur, demonstrating its robustness in real-world scenarios. Experiments in Facebook, Weibo, and WeChat scenarios show that our method achieves average F1 scores that are 2.8%, 3.0%, and 5.6% higher than those of existing state-of-the-art methods, respectively. Full article
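The content-aware dynamic weighting in a module like SCF can be illustrated with a two-branch softmax gate: each branch is scored from its global statistics, and the fused map is a convex combination of the branches. The scalar scoring weights here stand in for the learned gating network and are assumptions, not FENet's actual parameters:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                 # numerical stability
    e = np.exp(z)
    return e / e.sum()

def self_calibrating_fuse(rgb_feat, noise_feat, w_rgb, w_noise):
    """Score each branch from its global average, softmax the scores, and
    blend the branches with the resulting (sum-to-one) weights."""
    scores = np.array([w_rgb * rgb_feat.mean(), w_noise * noise_feat.mean()])
    a = softmax(scores)
    return a[0] * rgb_feat + a[1] * noise_feat, a
```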

19 pages, 14033 KiB  
Article
SCCA-YOLO: Spatial Channel Fusion and Context-Aware YOLO for Lunar Crater Detection
by Jiahao Tang, Boyuan Gu, Tianyou Li and Ying-Bo Lu
Remote Sens. 2025, 17(14), 2380; https://doi.org/10.3390/rs17142380 - 10 Jul 2025
Abstract
Lunar crater detection plays a crucial role in geological analysis and the advancement of lunar exploration. Accurate identification of craters is also essential for constructing high-resolution topographic maps and supporting mission planning in future lunar exploration efforts. However, lunar craters often suffer from insufficient feature representation due to their small size and blurred boundaries. In addition, the visual similarity between craters and the surrounding terrain further exacerbates background confusion. These challenges significantly hinder detection performance in remote sensing imagery and underscore the necessity of enhancing both local feature representation and global semantic reasoning. In this paper, we propose a novel Spatial Channel Fusion and Context-Aware YOLO (SCCA-YOLO) model built upon the YOLO11 framework. Specifically, the Context-Aware Module (CAM) employs a multi-branch dilated convolutional structure to enhance feature richness and expand the local receptive field, thereby strengthening the feature extraction capability. The Joint Spatial and Channel Fusion Module (SCFM) fuses spatial and channel information to model the global relationships between craters and the background, effectively suppressing background noise and reinforcing feature discrimination. In addition, the improved Channel Attention Concatenation (CAC) strategy adaptively learns channel-wise importance weights during feature concatenation, further optimizing multi-scale semantic feature fusion and enhancing the model's sensitivity to critical crater features. The proposed method is validated on a self-constructed Chang'e 6 dataset covering the landing site and its surrounding areas. Experimental results demonstrate that our model achieves an mAP@0.5 of 96.5% and an mAP@0.5:0.95 of 81.5%, outperforming other mainstream detection models, including the YOLO family of algorithms. These findings highlight the potential of SCCA-YOLO for high-precision lunar crater detection and provide valuable insights into future lunar surface analysis. Full article
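Channel-attention-weighted concatenation in the spirit of CAC can be sketched as: concatenate two feature maps along the channel axis, derive one weight per channel from its global average, and rescale. The sigmoid-of-mean gate is a stand-in for the learned attention; shapes and values below are invented:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention_concat(feats_a, feats_b):
    """Concatenate two (C, H, W) feature maps channel-wise, then reweight each
    channel by a sigmoid of its global average (proxy for a learned gate)."""
    cat = np.concatenate([feats_a, feats_b], axis=0)   # (C1 + C2, H, W)
    gates = sigmoid(cat.mean(axis=(1, 2)))             # one weight per channel
    return cat * gates[:, None, None]
```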

15 pages, 1359 KiB  
Article
Phoneme-Aware Hierarchical Augmentation and Semantic-Aware SpecAugment for Low-Resource Cantonese Speech Recognition
by Lusheng Zhang, Shie Wu and Zhongxun Wang
Sensors 2025, 25(14), 4288; https://doi.org/10.3390/s25144288 - 9 Jul 2025
Abstract
Cantonese Automatic Speech Recognition (ASR) is hindered by tonal complexity, acoustic diversity, and a lack of labelled data. This study proposes a phoneme-aware hierarchical augmentation framework that enhances performance without additional annotation. A Phoneme Substitution Matrix (PSM), built from Montreal Forced Aligner alignments and Tacotron-2 synthesis, injects adversarial phoneme variants into both transcripts and their aligned audio segments, enlarging pronunciation diversity. Concurrently, a semantic-aware SpecAugment scheme exploits wav2vec 2.0 attention heat maps and keyword boundaries to adaptively mask informative time–frequency regions; a reinforcement-learning controller tunes the masking schedule online, forcing the model to rely on a wider context. On the Common Voice Cantonese 50 h subset, the combined strategy reduces the character error rate (CER) from 26.17% to 16.88% with wav2vec 2.0 and from 38.83% to 23.55% with Zipformer. At 100 h, the CER further drops to 4.27% and 2.32%, yielding relative gains of 32–44%. Ablation studies confirm that phoneme-level and masking components provide complementary benefits. The framework offers a practical, model-independent path toward accurate ASR for Cantonese and other low-resource tonal languages. This paper presents an intelligent sensing-oriented modeling framework for speech signals, which is suitable for deployment on edge or embedded systems to process input from audio sensors (e.g., microphones) and shows promising potential for voice-interactive terminal applications. Full article
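A semantic-aware time mask of the kind described above (zeroing the frames an attention profile scores highest, so the model must recover them from context) might look like the following sketch; the attention vector is assumed to come from a model such as wav2vec 2.0, and the reinforcement-learning masking controller is omitted:

```python
import numpy as np

def attention_time_mask(spec, attn, n_mask=2, width=1):
    """Zero out the `n_mask` most-attended time frames of a (freq, time)
    spectrogram. attn: one score per time frame."""
    masked = spec.copy()
    order = np.argsort(attn)[::-1]               # most-attended frames first
    for t in order[:n_mask]:
        lo = max(0, t - width // 2)
        hi = min(spec.shape[1], t + width // 2 + 1)
        masked[:, lo:hi] = 0.0
    return masked
```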

28 pages, 10581 KiB  
Article
A Textual Semantic Analysis Framework Integrating Geographic Metaphors and GIS-Based Spatial Analysis Methods
by Yu Liu, Zhen Ren, Kaifeng Wang, Qin Tian, Xi Kuai and Sheng Li
Symmetry 2025, 17(7), 1064; https://doi.org/10.3390/sym17071064 - 4 Jul 2025
Abstract
Geographic information systems (GISs) have shown considerable promise in enhancing textual semantic analysis. Current textual semantic analysis methods face significant limitations in accurately delineating semantic boundaries, identifying semantic clustering patterns, and representing knowledge evolution. To address these issues, this study proposes a framework that innovatively introduces GIS methods into textual semantic analysis and aligns them with the conceptual foundation of geographical metaphor theory. Specifically, word embedding models are employed to endow semantic primitives with comprehensive, high-dimensional semantic representations. GIS methods and geographical metaphors are subsequently utilized to project both semantic primitives and their relationships into a low-dimensional geospatial analog, thereby constructing a semantic space model that facilitates accurate delineation of semantic boundaries. On the basis of this model, spatial correlation measurements are adopted to reveal underlying semantic patterns, while knowledge evolution is represented using ArcGIS 10.7-based visualization techniques. Experiments on social media data validate the effectiveness of the framework in semantic boundary delineation and clustering pattern identification. Moreover, the framework supports dynamic three-dimensional visualization of topic evolution. Importantly, by employing specialized visualization methods, the proposed framework enables the intuitive representation of semantic symmetry and asymmetry within semantic spaces. Full article
(This article belongs to the Special Issue Applications Based on Symmetry/Asymmetry in Data Mining)

22 pages, 8689 KiB  
Article
Transfer Learning-Based Accurate Detection of Shrub Crown Boundaries Using UAS Imagery
by Jiawei Li, Huihui Zhang and David Barnard
Remote Sens. 2025, 17(13), 2275; https://doi.org/10.3390/rs17132275 - 3 Jul 2025
Abstract
The accurate delineation of shrub crown boundaries is critical for ecological monitoring, land management, and understanding vegetation dynamics in fragile ecosystems such as semi-arid shrublands. While traditional image processing techniques often struggle with overlapping canopies, deep learning methods, such as convolutional neural networks (CNNs), offer promising solutions for precise segmentation. This study employed high-resolution imagery captured by unmanned aircraft systems (UASs) throughout the shrub growing season and explored the effectiveness of transfer learning for both semantic segmentation (Attention U-Net) and instance segmentation (Mask R-CNN). It utilized pre-trained model weights from two previous studies that originally focused on tree crown delineation to improve shrub crown segmentation in non-forested areas. Results showed that transfer learning alone did not achieve satisfactory performance due to differences in object characteristics and environmental conditions. However, fine-tuning the pre-trained models by unfreezing additional layers improved segmentation accuracy by around 30%. Fine-tuned pre-trained models show limited sensitivity to shrubs in the early growing season (April to June) and improved performance when shrub crowns become more spectrally unique in late summer (July to September). These findings highlight the value of combining pre-trained models with targeted fine-tuning to enhance model adaptability in complex remote sensing environments. The proposed framework demonstrates a scalable solution for ecological monitoring in data-scarce regions, supporting informed land management decisions and advancing the use of deep learning for long-term environmental monitoring. Full article
(This article belongs to the Section Remote Sensing in Geology, Geomorphology and Hydrology)

32 pages, 5287 KiB  
Article
UniHSFormer X for Hyperspectral Crop Classification with Prototype-Routed Semantic Structuring
by Zhen Du, Senhao Liu, Yao Liao, Yuanyuan Tang, Yanwen Liu, Huimin Xing, Zhijie Zhang and Donghui Zhang
Agriculture 2025, 15(13), 1427; https://doi.org/10.3390/agriculture15131427 - 2 Jul 2025
Abstract
Hyperspectral imaging (HSI) plays a pivotal role in modern agriculture by capturing fine-grained spectral signatures that support crop classification, health assessment, and land-use monitoring. However, the transition from raw spectral data to reliable semantic understanding remains challenging—particularly under fragmented planting patterns, spectral ambiguity, and spatial heterogeneity. To address these limitations, we propose UniHSFormer-X, a unified transformer-based framework that reconstructs agricultural semantics through prototype-guided token routing and hierarchical context modeling. Unlike conventional models that treat spectral–spatial features uniformly, UniHSFormer-X dynamically modulates information flow based on class-aware affinities, enabling precise delineation of field boundaries and robust recognition of spectrally entangled crop types. Evaluated on three UAV-based benchmarks—WHU-Hi-LongKou, HanChuan, and HongHu—the model achieves up to 99.80% overall accuracy and 99.28% average accuracy, outperforming state-of-the-art CNN, ViT, and hybrid architectures across both structured and heterogeneous agricultural scenarios. Ablation studies further reveal the critical role of semantic routing and prototype projection in stabilizing model behavior, while parameter surface analysis demonstrates consistent generalization across diverse configurations. Beyond high performance, UniHSFormer-X offers a semantically interpretable architecture that adapts to the spatial logic and compositional nuance of agricultural imagery, representing a forward step toward robust and scalable crop classification. Full article
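Prototype-guided token routing reduces, at its core, to assigning each token to the class prototype with the highest affinity. A minimal cosine-affinity sketch (the prototypes here are invented, and the paper's routing is richer than a hard argmax):

```python
import numpy as np

def route_tokens(tokens, prototypes):
    """Assign each token to the class prototype with the highest cosine
    affinity. tokens: (N, D); prototypes: (K, D). Returns (N,) class indices."""
    t = tokens / np.linalg.norm(tokens, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    affinity = t @ p.T                 # (N, K) cosine similarities
    return affinity.argmax(axis=1)
```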

32 pages, 6860 KiB  
Article
Participatory Drawing Methodology for Light in Architecture: Drawing Experienced Space
by Ulrika Wänström Lindh
Buildings 2025, 15(13), 2278; https://doi.org/10.3390/buildings15132278 - 28 Jun 2025
Abstract
Visual techniques can capture information about visual experiences in ways that differ from speaking and writing. This article examines drawing as a data collection method in architectural lighting research. Lighting design is a rapidly growing profession, and there is a need to build research knowledge of people's spatial experience of lit environments and to develop methods that capture it. In a study in which 16 participants' experiences of different light scenarios were collected through sketches, semantic rating scales, and in-depth interviews, the participants drew the boundaries of what they experienced as "the room" and the spatial directions inside it. The resulting 64 sketches were compared in different combinations to detect patterns. The results showed that the drawing method worked well for everybody, both those with and those without professional drawing experience. The method, named Drawing Experienced Space, helped participants, especially those without training, find words and expressions for their experiences. Full article
(This article belongs to the Special Issue Lighting Design for the Built Environment)

25 pages, 4835 KiB  
Article
Object Tracking Algorithm Based on Multi-Layer Feature Fusion and Semantic Enhancement
by Jing Wang, Yanru Wang, Dan Yuan, Yuxiang Que, Weichao Huang and Yuan Wei
Appl. Sci. 2025, 15(13), 7228; https://doi.org/10.3390/app15137228 - 26 Jun 2025
Abstract
The TransT object tracking algorithm, built on a Transformer architecture, effectively integrates deep feature extraction with attention mechanisms, thereby enhancing the stability and accuracy of the tracker. However, the algorithm exhibits insufficient tracking accuracy and bounding-box drift when dealing with similar background clutter, which directly affects the subsequent tracking process. To overcome this problem, this paper constructs a semantic enhancement model that utilizes multi-layer feature representations extracted from deep networks and correlates and fuses shallow features with deep features using cross-attention. At the same time, to adapt to changes in the object's surroundings and establish good discrimination against similar objects, this paper proposes a dynamic mask strategy to optimize the attention allocation mechanism. Finally, an object template update mechanism compares the spatio-temporal information of successive frames to update the object template in time, improving the adaptability of the model and further enhancing its tracking performance in complex scenes. Experimental comparisons demonstrate that the proposed algorithm can effectively handle similar background clutter, leading to a significant improvement in the overall performance of the tracking model. Full article
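Correlating shallow and deep features with cross-attention, as described above, follows the standard pattern: deep features form the queries, shallow features the keys and values. A single-head NumPy sketch with invented dimensions and no learned projections:

```python
import numpy as np

def cross_attention(deep_q, shallow_kv):
    """Single-head cross-attention: deep features (N, D) query shallow features
    (M, D), pulling fine detail into the semantic stream. Returns (output, weights)."""
    d = deep_q.shape[1]
    scores = deep_q @ shallow_kv.T / np.sqrt(d)          # (N, M) similarity logits
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=1, keepdims=True)        # each row sums to 1
    return attn @ shallow_kv, attn
```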

25 pages, 18500 KiB  
Article
DBFormer: A Dual-Branch Adaptive Remote Sensing Image Resolution Fine-Grained Weed Segmentation Network
by Xiangfei She, Zhankui Tang, Xin Pan, Jian Zhao and Wenyu Liu
Remote Sens. 2025, 17(13), 2203; https://doi.org/10.3390/rs17132203 - 26 Jun 2025
Abstract
Remote sensing image segmentation holds significant application value in precision agriculture, environmental monitoring, and other fields. However, in the task of fine-grained segmentation of weeds and crops, traditional deep learning methods often fail to balance global semantic information with local detail features, resulting in over-segmentation or under-segmentation issues. To address this challenge, this paper proposes a segmentation model based on a dual-branch Transformer architecture—DBFormer—to enhance the accuracy of weed detection in remote sensing images. This approach integrates the following techniques: (1) a dynamic context aggregation branch (DCA-Branch) with adaptive downsampling attention to model long-range dependencies and suppress background noise, and (2) a local detail enhancement branch (LDE-Branch) leveraging depthwise-separable convolutions with residual refinement to preserve and sharpen small weed edges. An Edge-Aware Loss module further reinforces boundary clarity. On the Tobacco Dataset, DBFormer achieves an mIoU of 86.48%, outperforming the best baseline by 3.83%; on the Sunflower Dataset, it reaches 85.49% mIoU, a 4.43% absolute gain. These results demonstrate that our dual-branch synergy effectively resolves the global–local conflict, delivering superior accuracy and stability in the context of practical agricultural applications. Full article

24 pages, 595 KiB  
Article
An Empirical Comparison of Machine Learning and Deep Learning Models for Automated Fake News Detection
by Yexin Tian, Shuo Xu, Yuchen Cao, Zhongyan Wang and Zijing Wei
Mathematics 2025, 13(13), 2086; https://doi.org/10.3390/math13132086 - 25 Jun 2025
Abstract
Detecting fake news is a critical challenge in natural language processing (NLP), demanding solutions that balance accuracy, interpretability, and computational efficiency. Despite advances in NLP, systematic empirical benchmarks that directly compare both classical and deep models—across varying input richness and with careful attention to interpretability and computational tradeoffs—remain underexplored. In this study, we systematically evaluate the mathematical foundations and empirical performance of five representative models for automated fake news classification: three classical machine learning algorithms (Logistic Regression, Random Forest, and Light Gradient Boosting Machine) and two state-of-the-art deep learning architectures (A Lite Bidirectional Encoder Representations from Transformers—ALBERT and Gated Recurrent Units—GRUs). Leveraging the large-scale WELFake dataset, we conduct rigorous experiments under both headline-only and headline-plus-content input scenarios, providing a comprehensive assessment of each model’s capability to capture linguistic, contextual, and semantic cues. We analyze each model’s optimization framework, decision boundaries, and feature importance mechanisms, highlighting the empirical tradeoffs between representational capacity, generalization, and interpretability. Our results show that transformer-based models, especially ALBERT, achieve state-of-the-art performance (macro F1 up to 0.99) with rich context, while classical ensembles remain viable for constrained settings. These findings directly inform practical fake news detection. Full article
(This article belongs to the Special Issue Mathematical Foundations in NLP: Applications and Challenges)

17 pages, 1728 KiB  
Article
Spatiotemporal Contextual 3D Semantic Segmentation for Intelligent Outdoor Mining
by Wenhao Yang, Liqun Kuang, Song Wang, Xie Han, Rong Guo, Yongpeng Wang, Haifeng Yue and Tao Wei
Algorithms 2025, 18(7), 383; https://doi.org/10.3390/a18070383 - 24 Jun 2025
Abstract
Three-dimensional semantic segmentation plays a crucial role in accurately identifying terrain features and objects by effectively extracting 3D spatial information from the environment. However, the inherent sparsity of point clouds and unclear terrain boundaries in outdoor mining environments significantly complicate the recognition process. To address these challenges, we propose a novel 3D semantic segmentation network that incorporates spatiotemporal feature aggregation. Specifically, we introduce the Gated Spatiotemporal Clue Encoder, which extracts spatiotemporal context from historical multi-frame point cloud data and combines it with the current scan frame to enhance feature representation. Additionally, the Spatiotemporal Feature State Space Module is proposed to efficiently model long-term spatiotemporal features while minimizing computational and memory overhead. Experimental results show that the proposed method outperforms the baseline model, achieving a 2.1% mIoU improvement on the self-constructed TZMD_NUC outdoor mining dataset and a 1.9% average improvement on the public SemanticKITTI dataset, while also improving computational efficiency. These results validate the effectiveness of the proposed method, offering a promising solution for 3D semantic segmentation in complex, real-world mining environments where both computational efficiency and accuracy are critical. Full article
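The gated blending of historical and current-frame features can be illustrated with an element-wise sigmoid gate over the disagreement between the current frame and the history mean. This is a toy stand-in for the learned Gated Spatiotemporal Clue Encoder, with all shapes invented:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_temporal_fuse(current, history, gate_bias=0.0):
    """Blend current-frame features with the mean of historical frames through
    a per-element sigmoid gate; where the frames agree, the gate sits at 0.5."""
    hist = np.mean(history, axis=0)
    gate = sigmoid(current - hist + gate_bias)
    return gate * current + (1.0 - gate) * hist
```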

20 pages, 67212 KiB  
Article
KPV-UNet: KAN PP-VSSA UNet for Remote Image Segmentation
by Shuiping Zhang, Qiang Rao, Lei Wang, Tang Tang and Chen Chen
Electronics 2025, 14(13), 2534; https://doi.org/10.3390/electronics14132534 - 23 Jun 2025
Abstract
Semantic segmentation of remote sensing images is a key technology for land cover interpretation and target identification. Although convolutional neural networks (CNNs) have achieved remarkable success in this field, their inherent limitation of local receptive fields restricts their ability to model long-range dependencies and global contextual information. As a result, CNN-based methods often struggle to capture the comprehensive spatial context necessary for accurate segmentation in complex remote sensing scenes, leading to issues such as the misclassification of small objects and blurred or imprecise object boundaries. To address these problems, this paper proposes a new hybrid architecture called KPV-UNet, which integrates the Kolmogorov–Arnold Network (KAN) and the Pyramid Pooling Visual State Space Attention (PP-VSSA) block. KPV-UNet introduces a deep feature refinement module based on KAN and incorporates PP-VSSA to enable scalable long-range modeling. This design effectively captures global dependencies and abundant localized semantic content extracted from complex feature spaces, overcoming CNNs' limitations in modeling long-range dependencies and global context in large-scale complex scenes. In addition, we designed an Auxiliary Local Monitoring (ALM) block that significantly enhances KPV-UNet's perception of local content. Experimental results demonstrate that KPV-UNet outperforms state-of-the-art methods on the Vaihingen, LoveDA Urban, and WHDLD datasets, achieving mIoU scores of 84.03%, 51.27%, and 62.87%, respectively. The proposed method not only improves segmentation accuracy but also produces clearer and better-connected object boundaries in visual results. Full article
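Pyramid pooling, the "PP" in PP-VSSA, pools a feature map onto coarse grids and broadcasts each cell back to full resolution so every position sees regional context. A minimal single-channel sketch (bin sizes invented, and H and W assumed divisible by each bin):

```python
import numpy as np

def pyramid_pool(feat, bins=(1, 2)):
    """Pool an (H, W) map onto coarse grids, broadcast each cell back to full
    resolution, and stack with the input: (1 + len(bins), H, W)."""
    h, w = feat.shape
    maps = [feat]
    for b in bins:
        pooled = feat.reshape(b, h // b, b, w // b).mean(axis=(1, 3))  # (b, b)
        maps.append(np.repeat(np.repeat(pooled, h // b, axis=0), w // b, axis=1))
    return np.stack(maps)
```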

16 pages, 1058 KiB  
Article
Multi-Scale Context Enhancement Network with Local–Global Synergy Modeling Strategy for Semantic Segmentation on Remote Sensing Images
by Qibing Ma, Hongning Liu, Yifan Jin and Xinyue Liu
Electronics 2025, 14(13), 2526; https://doi.org/10.3390/electronics14132526 - 21 Jun 2025
Abstract
Semantic segmentation of remote sensing images is a fundamental task in geospatial analysis and Earth observation research and has a wide range of applications in urban planning, land cover classification, and ecological monitoring. In complex geographic scenes, low target-background discriminability in overhead views (e.g., indistinct boundaries, ambiguous textures, and low contrast) significantly complicates local–global information modeling and results in blurred boundaries and classification errors in model predictions. To address this issue, this paper proposes a novel Multi-Scale Local–Global Mamba Feature Pyramid Network (MLMFPN) built around a local–global information synergy modeling strategy, which guides and enhances cross-scale contextual information interaction during feature fusion to obtain high-quality semantic features that serve as cues for precise semantic reasoning. MLMFPN comprises two core components: the Local–Global Align Mamba Fusion (LGAMF) module and the Context-Aware Cross-attention Interaction Module (CCIM). Specifically, LGAMF designs a local-enhanced global information modeling scheme using asymmetric convolutions that jointly model receptive fields in the vertical and horizontal directions, and further introduces the Vision Mamba structure to facilitate local–global information fusion. CCIM introduces positional encoding and cross-attention mechanisms to enrich the global-spatial semantic representation during multi-scale context information interaction, thereby achieving refined segmentation. The proposed method is evaluated on the ISPRS Potsdam and Vaihingen datasets, and the results verify its effectiveness. Full article
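The asymmetric-convolution idea in LGAMF (separate horizontal 1×k and vertical k×1 filters covering the two directions) can be sketched with plain mean filters; the kernels are illustrative, not the learned ones:

```python
import numpy as np

def conv2d_same(x, k):
    """2-D correlation with zero 'same' padding (odd kernel dimensions)."""
    kh, kw = k.shape
    xp = np.pad(x, ((kh // 2,), (kw // 2,)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = (xp[i:i + kh, j:j + kw] * k).sum()
    return out

def asymmetric_branch(x, k=3):
    """Sum of a 1×k (horizontal) and a k×1 (vertical) mean filter: directional
    receptive fields along the two axes, as in the local–global branch."""
    horiz = conv2d_same(x, np.ones((1, k)) / k)
    vert = conv2d_same(x, np.ones((k, 1)) / k)
    return horiz + vert
```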
