MDPI - Publisher of Open Access Journals

22 pages, 3939 KB

Open AccessArticle

Semantic Segmentation Method for Sparse Point Clouds Based on Straight Flow Completion and Multi-Feature Fusion

by Tong Zheng, Zhiyuan Meng, Chongchong Yu, Tao Xie and Yewang Xu

Sensors 2026, 26(10), 3056; https://doi.org/10.3390/s26103056 - 12 May 2026

Point cloud semantic segmentation is a vital task in 3D computer vision. However, the inherent sparsity of point clouds complicates the segmentation process. In contexts such as autonomous driving, moving objects frequently exhibit motion blur, which degrades semantic segmentation performance. These challenges hinder the practical application of point cloud semantic segmentation. To address these issues, this paper presents a semantic segmentation method that integrates sparse point cloud completion with multi-feature fusion. Specifically, the study develops efficient strategies for constructing and training point cloud completion models, aiming to speed up parameter training while maximizing completion accuracy. Additionally, a semantic segmentation model is introduced that combines motion feature-enhanced instance features with semantic features, thereby improving adaptability to moving objects. Moreover, point cloud completion and semantic segmentation are linked in an end-to-end pipeline, enabling accurate semantic segmentation of sparse point clouds in dynamic environments. In the experiments, publicly available datasets, including the LiDAR point cloud dataset SemanticKITTI and the millimeter-wave radar dataset RADIal, are used to compare the proposed method with classical approaches in terms of point cloud completion performance and semantic segmentation effectiveness, demonstrating the reliability of the proposed method. Full article

(This article belongs to the Section Sensing and Imaging)

23 pages, 6159 KB

Open AccessArticle

GIDNet: Infrared Small Target Detection Network Based on Gradient-Intensity Decoupled

by Xianwei Gao, Jingtao Wu, Dafeng Cao, Haotian Xu, Yingjie Ma, Lu Li and Mingjing Zhao

Remote Sens. 2026, 18(10), 1527; https://doi.org/10.3390/rs18101527 - 12 May 2026

Abstract

Infrared small target detection (IRSTD) plays a pivotal role in a wide range of applications. Despite extensive research and the numerous algorithms proposed in recent years, IRSTD remains a formidable task, primarily because of inherently low signal-to-noise ratios (SNR) and intricate background clutter. Current models remain constrained by three critical bottlenecks: the degradation of spectral coupling between intensity and gradient information in deep layers, the limited scale adaptability of static filters, and the loss of spatial precision caused by iterative downsampling. To address these issues, we propose GIDNet, a gradient-intensity decoupled network that balances target energy preservation and noise suppression. GIDNet incorporates three core components: a gradient-intensity synergistic convolution (GISC) that jointly encodes intensity and gradient information for robust target enhancement; a multi-scale difference contrast (MSDC) module for scale-adaptive detection via adaptive contrast modeling; and a shallow feature projection (SFP) strategy that maintains precise spatial localization by bridging the gap between deep semantics and shallow spatial details. Comprehensive evaluations, encompassing both quantitative metrics and qualitative visualizations, consistently show that GIDNet outperforms 16 competing methods. Full article

(This article belongs to the Special Issue Remote Sensing Data Preprocessing and Calibration)

34 pages, 15156 KB

Open AccessReview

From Cooperative Dual-Arm Manipulators to Cooperative Multi-Arm Manipulators—Where Are We Standing Today?

by Lander Ketelbuters, Bart Engelen, Ivo Dekker and Karel Kellens

Robotics 2026, 15(5), 97; https://doi.org/10.3390/robotics15050097 (registering DOI) - 11 May 2026

Abstract

This paper highlights the state of the art in Cooperative Dual-Manipulation (CDM) and Cooperative Multi-Manipulation (CMM), comparing advances in modeling, control, planning, sensing, vision, and end-effector technologies. Methods originally established in CDM have been extended or adapted to support higher complexity of CMM. A historical timeline visualizes the steady growth of cooperative manipulation (CM) and the recent acceleration of CMM driven by rising process complexity and the need for more flexible automation strategies. CM is becoming increasingly relevant as industrial processes demand higher payload capacity, larger workspaces, and greater flexibility. In addition, this paper categorizes existing applications by cooperation type and application domain. Here, a clear dominance of simultaneous object manipulation tasks is visible (fixation-fixation). However, fixation-tooling tasks, where one manipulator grasps the product while another performs a tool operation, and tooling-tooling tasks, where multiple manipulators perform tool operations simultaneously, remain significantly underrepresented. A similar imbalance is found for rigid/non-deformable object manipulation and flexible/deformable object manipulation, respectively. Based on this review, several research gaps are identified: (i) reliable flexible object manipulation methods; (ii) CM strategies for disassembly (e.g., battery pack deconstruction); (iii) complexity in control and planning for multi-manipulator systems; (iv) pathways to industrial deployment beyond laboratory demonstrators; and (v) task-specific tooling and end-effector innovation. Full article

(This article belongs to the Section Intelligent Robots and Mechatronics)

27 pages, 1868 KB

Open AccessArticle

DualMambaFormer: A Parallel Hybrid Transformer–Mamba Network for Hyperspectral Image Classification

by Jiang Yu, Jingwei Li, Gan Sun, Jingying Lu, Xuejun Cheng, Ruimeng Zhou, Wei Sun and Xianjun Gao

Remote Sens. 2026, 18(10), 1516; https://doi.org/10.3390/rs18101516 - 11 May 2026

Abstract

Hyperspectral image classification (HSIC) plays a crucial role in fine-grained Earth observation tasks. However, balancing efficient long-range dependency modeling with the extraction of fine-grained local features remains a significant challenge, primarily due to the inherent high-dimensional spectral redundancy and complex spatial variability of hyperspectral data. Existing modeling paradigms exhibit distinct limitations: Convolutional Neural Networks (CNNs) are constrained by localized receptive fields, while Vision Transformers (ViTs), despite their global receptive capabilities, incur prohibitive quadratic computational complexity. Meanwhile, the emerging Mamba architecture has demonstrated remarkable effectiveness in sequence modeling with linear complexity, but it often lacks sufficient sensitivity to local textures when directly applied to non-causal 2D images. To address these limitations, this paper proposes a novel parallel hybrid architecture termed DualMambaFormer. Deviating from the traditional serial stacking paradigm, the proposed network utilizes a dual-stream design to achieve the complementary fusion of global static attention and dynamic sequence reasoning. Specifically, the model first employs an SS-ResNet for spectral dimensionality reduction and local feature embedding. Subsequently, the architecture bifurcates into a parallel encoding stage: one branch leverages Multi-Head Self-Attention (MHSA) to capture global spatial correlations, while the other introduces a Local Enhanced Mamba (LEM) branch. By integrating State Space Models (SSM) with depthwise separable convolutions, the LEM branch simultaneously captures long-range causal dependencies and local spatial context. Finally, a dual class token fusion strategy is designed to integrate heterogeneous representations at the decision level. 
Extensive experiments on four benchmark datasets—Indian Pines, Pavia University, Salinas, and WHU-HongHu—show that DualMambaFormer achieves OA values of 96.56%, 98.95%, 97.60%, and 96.09%, respectively, with consistently high AA and Kappa coefficients. These results demonstrate the effectiveness, robustness, and generalization capability of the proposed method for hyperspectral image classification. Compared with the second-best competing methods, DualMambaFormer improves OA by 5.55, 2.30, 1.68, and 4.30 percentage points on the Pavia University, Indian Pines, Salinas, and WHU-HongHu datasets, respectively. Full article
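The three reported metrics (OA, AA, and the Kappa coefficient) can all be derived from a confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `classification_scores` and the toy counts are assumptions made for illustration only.

```python
def classification_scores(confusion):
    """Return (OA, AA, kappa) from a square confusion matrix.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    # Overall accuracy (OA): fraction of all samples classified correctly.
    oa = correct / total

    # Average accuracy (AA): mean of the per-class recalls.
    recalls = [confusion[i][i] / sum(confusion[i]) for i in range(n_classes)]
    aa = sum(recalls) / n_classes

    # Cohen's kappa: observed agreement corrected for chance agreement p_e.
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    p_e = sum(sum(confusion[k]) * col_sums[k] for k in range(n_classes)) / total**2
    kappa = (oa - p_e) / (1 - p_e)
    return oa, aa, kappa


# Toy two-class example with made-up counts (not data from the paper).
toy = [[40, 10],
       [5, 45]]
oa, aa, kappa = classification_scores(toy)
```

OA and AA diverge when classes are imbalanced, which is why hyperspectral benchmarks usually report both alongside Kappa.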

(This article belongs to the Special Issue Advances in Hyperspectral Remote Sensing Image Processing: 2nd Edition)

33 pages, 7528 KB

Open AccessArticle

A Deep Q-Network and Genetic Algorithm-Based Algorithm for Efficient Task Allocation in UAV Ad Hoc Networks

by Xiaobin Zhang, Jian Cao, Zeliang Zhang, Yuxin Li and Yuhui Li

Electronics 2026, 15(10), 2041; https://doi.org/10.3390/electronics15102041 - 11 May 2026

Abstract

As the number of unmanned aerial vehicles (UAVs) and the volume of computational tasks increase in UAV ad hoc networks (UAVANET), the solution space for task allocation strategies grows exponentially. In practical emergency scenarios with concurrent multi-user access, multi-UAV systems equipped with mobile edge computing (MEC) devices face challenges such as limited computing resources and imbalanced task distribution during task offloading. To address these challenges, this paper proposes an adaptive task allocation algorithm named AUSTA-DQHO (Adaptive UAV Swarm Task Allocation using Deep Q-networks and Genetic Algorithms Hybrid Optimization), which combines Deep Q-Network (DQN) with Genetic Algorithm (GA), aiming to optimize computational task scheduling and minimize both the total task delay and the variance in task delays. First, we introduce a multi-UAV-assisted MEC application framework. In this framework, UAVs equipped with high-performance computing modules are deployed as airborne servers in the target area, providing data offloading and task computation support for IoT devices. Next, to tackle the optimization problem, we replace the random action selection process in DQN with a hybrid strategy that incorporates heuristic methods—specifically, GA and greedy algorithms—to perform global search and make more effective decisions for optimal task allocation for each offloading request. Furthermore, to accelerate the convergence of the AUSTA-DQHO policy while ensuring global optimality, we introduce a pre-clustering mechanism and a dynamic weighting factor for randomly generated task offloading requests in the target area. These mechanisms effectively reduce the solution space and ensure that optimal actions are learned at different stages of the training process. 
Experimental results demonstrate that the proposed algorithm achieves a task latency reduction of 18.72% and a load balancing improvement of 98.72%, surpassing the performance of the other algorithms. Additionally, we explore the optimal number of UAVs under given environmental conditions to minimize the waste of computing resources. Full article
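The idea of replacing random action selection in DQN with heuristic choices can be sketched as a modified epsilon-greedy rule. This is a simplified illustration under stated assumptions: it uses only a least-loaded greedy heuristic (the abstract also mentions GA, which is omitted here), and the names `greedy_least_loaded` and `select_action` and all numeric values are hypothetical.

```python
import random


def greedy_least_loaded(loads):
    """Heuristic (assumed for illustration): pick the UAV with the smallest load."""
    return min(range(len(loads)), key=lambda i: loads[i])


def select_action(q_values, loads, epsilon, rng):
    """Epsilon-greedy choice where exploration is heuristic instead of uniform.

    With probability epsilon the heuristic proposes the action; otherwise the
    action with the highest Q-value estimate is exploited.
    """
    if rng.random() < epsilon:
        return greedy_least_loaded(loads)
    return max(range(len(q_values)), key=lambda i: q_values[i])


rng = random.Random(0)
q_values = [0.2, 0.9, 0.1]   # hypothetical Q-value estimates, one per UAV
loads = [5.0, 7.0, 1.0]      # hypothetical queued work per UAV
explore = select_action(q_values, loads, epsilon=1.0, rng=rng)  # heuristic branch
exploit = select_action(q_values, loads, epsilon=0.0, rng=rng)  # argmax-Q branch
```

Guiding exploration with a heuristic trades some diversity for more informative early samples; how the paper balances the two (for example through its dynamic weighting factor) is not specified in the abstract.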

(This article belongs to the Special Issue Advances in Intelligent Wireless Communications: AI, Optimization, and Beyond)

32 pages, 4400 KB

Open AccessArticle

Research on Space-Time Data Prediction Model of Quantum Long Short-Term Memory Network Fusion

by Bing Han, Jian Kang, Meng Zhang and Qian Wu

Photonics 2026, 13(5), 477; https://doi.org/10.3390/photonics13050477 (registering DOI) - 11 May 2026

Abstract

This study proposes a novel hybrid prediction model (QGCN-LSTM) that combines Quantum Graph Convolutional Networks (QGCN) with classical Long Short-Term Memory (LSTM). The model takes space-time data as input and employs a hierarchical graph-based quantum encoding strategy. Specifically, classical spatial features are first aggregated into critical regional hubs, which are then mapped into the Hilbert space through a dense quantum encoding layer. Multi-scale features are extracted through the collaborative computation of QGCN and quantum gated recurrent units, and a quantum attention module is introduced to dynamically screen key information. Finally, the prediction results are generated through quantum measurement and a classical output layer. For the space-time data prediction task of urban traffic flow, a benchmark model system covering classical, cutting-edge, and traditional architectures was constructed. The experimental results show that QGCN-LSTM uses quantum entanglement gates to establish non-local road network associations, dynamically allocates feature weights to strengthen the impact of critical time steps, and achieves deep circuit compression through quantum circuit pruning, effectively alleviating the common “barren plateau” problem in quantum neural network training. In terms of prediction accuracy, the mean absolute error (MAE) at key hub nodes is reduced by 34.1% compared to the graph convolution LSTM (GCN-LSTM) model, and the Spatial Correlation Index (SCI) is improved to 0.89. The model also performs well in dynamic response, edge computing efficiency, and other aspects, meeting the real-time requirements of the traffic signal control system. This study provides an effective paradigm for the application of quantum collaborative architecture in complex spatiotemporal prediction tasks. Full article

(This article belongs to the Special Issue Recent Progress in Quantum Communication)

24 pages, 4450 KB

Open AccessArticle

Adaptive Multi-Strategy Particle Swarm Optimization Path Planning Algorithm for Multi-Terrain Post-Disaster Relay Rescue

by Jianhua Zhang, Shuaiqi Pang, Xiaohai Ren, Yong Zhang, Yuxin Du and Geng Na

Appl. Sci. 2026, 16(10), 4748; https://doi.org/10.3390/app16104748 (registering DOI) - 11 May 2026

Abstract

Post-disaster rescue scenarios often involve complex and variable terrains, imposing heterogeneous mobility requirements on different transport modes. Single-type vehicles struggle to complete comprehensive rescue tasks independently. This study addresses the critical problem of coordinating heterogeneous aerial and ground vehicles to collaboratively plan relay rescue routes. To tackle the NP-hard multi-terrain, multi-vehicle, and multi-route path planning problem, we propose a New Adaptive Multi-Strategy Particle Swarm Optimization algorithm (AMS-PSO-NEW). The algorithm integrates differential evolution’s multi-strategy mutation, SHADE-based adaptive parameter control, population diversity monitoring with restart mechanisms, and multi-level local search. A sequential hybrid mechanism is designed in which DE-generated trial vectors serve as reference positions for PSO velocity updates, enabling balanced global exploration and local exploitation. By leveraging adaptive parameter tuning, success history memory, and diverse population maintenance, AMS-PSO-NEW effectively overcomes the premature convergence and low accuracy typical of traditional PSO in discrete combinatorial optimization. Performance is validated on six rescue scenarios varying in scale and complexity, benchmarking AMS-PSO-NEW against nine algorithms: PSO, GA, NSGA-II, GWO, DE, ABC, CS, Q-learning, and MIP. Results demonstrate superior performance across four metrics (rescue success rate, average rescue time, total cost, and fairness), with significant improvements in high-complexity environments. Full article

(This article belongs to the Section Electrical, Electronics and Communications Engineering)

19 pages, 3662 KB

Open AccessArticle

Spatial–Spectral Bidirectional-Driven Collaborative Network with Coordinate-Aware and Spectral-Modulated Interaction for Hyperspectral Pansharpening

by Qingshan Gao, Conghui Tao, Xiongjun Du, Yanmin Zhu, Shixuan Liu, Xiuxiu Chen and Qiuxiao Chen

Sensors 2026, 26(10), 3009; https://doi.org/10.3390/s26103009 - 10 May 2026

Abstract

High-resolution hyperspectral computational imaging is critical for applications such as environmental monitoring, urban planning, and precision agriculture. In practical hyperspectral imaging systems, physical hardware constraints inevitably lead to coupled degradations across spatial and spectral dimensions, making it difficult to simultaneously achieve high spatial resolution and high spectral fidelity. As a representative and widely studied hyperspectral computational imaging task, hyperspectral pansharpening aims to reconstruct high-resolution hyperspectral images by integrating low-resolution hyperspectral images with high-resolution panchromatic images. Existing methods frequently suffer from spectral distortion or blurred spatial details due to unidirectional fusion strategies or isolated processing branches that inadequately model the intrinsic spatial–spectral coupling in the imaging process. To overcome these limitations, we propose a bidirectional driving framework that enables synergistic mutual guidance between spatial detail infusion and spectral fidelity preservation. Specifically, spatial coordinate-aware representations are dynamically integrated into a spectral self-attention module, while spectral importance scores are utilized to modulate multi-receptive-field convolutions via channel-wise weighting. This bidirectional interaction mechanism forms a closed-loop coupling between spatial and spectral representations, ensuring enhanced spatial reconstruction while rigorously preserving spectral integrity. Furthermore, to bridge the gap between simulated experiments and real-world applications, we constructed a large-scale dataset derived from the ZY-1-02D satellite. This dataset features high-fidelity PAN (17,820 × 16,128) and HSI (1485 × 1344) pairs, which we have made publicly available to the community to facilitate future research. 
Extensive experiments on both benchmark simulations and the proposed ZY-1-02D dataset demonstrate that our method achieves state-of-the-art performance in both spatial fidelity and spectral preservation. Full article

(This article belongs to the Special Issue Advanced Sensing Towards Sustainable Agro-Water Systems)

23 pages, 765 KB

Open AccessArticle

Hybrid Quantum–Classical Computing for Multi-Objective Resource Allocation in Elastic Optical Networks

by Bakhe Nleya and Beverly Pule

Photonics 2026, 13(5), 472; https://doi.org/10.3390/photonics13050472 (registering DOI) - 9 May 2026

Abstract

The rapid advancement of beyond-5G and 6G services is creating computational challenges that classical optimisation methods for Elastic Optical Networks (EONs) cannot effectively handle. Specifically, the multi-objective Routing and Spectrum Assignment (RSA) problem—aimed at minimising blocking probability, maximising spectral efficiency, and reducing fragmentation—poses significant challenges and is NP-hard, particularly in dynamic traffic. This paper introduces a hybrid framework that combines quantum and classical computing, dividing the optimisation tasks into classical pre-processing, a quantum optimisation core, and classical post-processing with Pareto frontier management. The RSA problem is modelled using a Quadratic Unconstrained Binary Optimisation (QUBO) formulation that accounts for blocking, efficiency, and a quadratic fragmentation metric. Simulations conducted on NSFNET and UBN topologies under Poisson traffic conditions revealed that even in realistic, noisy quantum environments, this hybrid method reduces the blocking probability by 14% and improves fragmentation by 7.3% compared to the top classical heuristics. A scaling analysis indicates a key point of around 220 variables where this hybrid strategy surpasses traditional meta-heuristics in both solution quality and execution time, emphasising its significant potential in the current NISQ era. Full article
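To make the QUBO modelling step concrete, the sketch below encodes a deliberately tiny toy problem: choose exactly one spectrum slot for a single request, given a per-slot cost. It is not the paper's formulation (which combines blocking, efficiency, and a quadratic fragmentation metric); the cost values, the penalty weight, and the function names are assumptions, and the minimum is found by brute force rather than on quantum hardware.

```python
from itertools import product


def build_qubo(costs, penalty):
    """Upper-triangular QUBO for 'choose exactly one slot' with per-slot costs.

    Encodes  sum_i costs[i]*x_i + penalty*(sum_i x_i - 1)**2  up to the additive
    constant 'penalty', using x_i**2 == x_i for binary variables.
    """
    n = len(costs)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        q[i][i] = costs[i] - penalty          # linear terms sit on the diagonal
        for j in range(i + 1, n):
            q[i][j] = 2.0 * penalty           # picking two slots is penalised
    return q


def energy(q, x):
    """Evaluate x^T Q x for an upper-triangular Q and a binary tuple x."""
    n = len(x)
    return sum(q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_minimum(q):
    """Exhaustively search all 2^n assignments (only viable for tiny n)."""
    n = len(q)
    return min(product((0, 1), repeat=n), key=lambda x: energy(q, x))


costs = [3.0, 1.0, 2.0]            # hypothetical per-slot costs
q = build_qubo(costs, penalty=10.0)
best = brute_force_minimum(q)      # the assignment selecting the cheapest slot
```

The penalty weight must exceed the largest cost difference so that violating the one-slot constraint is never cheaper than satisfying it; in a hybrid workflow the brute-force search would be replaced by the quantum optimisation core.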

29 pages, 25329 KB

Open AccessArticle

WMC-DFINE: An Improved DFINE Model for Aluminum Profile Surface Defect Detection

by Pengfei He, Yunming Ding, Shuwen Yan, Guoheng Wang and Xia Liu

Sensors 2026, 26(10), 2994; https://doi.org/10.3390/s26102994 - 9 May 2026

Abstract

The automated inspection of aluminum profile surface defects, which heavily relies on data acquired by machine vision sensors, is a critical task in industrial quality control. Addressing the current challenges of intense background texture interference and the difficulty in detecting defects with extreme aspect ratios on aluminum profiles, this research puts forward a complete end-to-end defect detection algorithm named WMC-DFINE (WIFA-MKSS-CSFF-DFINE) based on the DFINE framework. First, a Wavelet-Integrated Frequency Attention (WIFA) module is introduced, which utilizes a discrete wavelet transform to decouple features into the frequency domain, thereby dynamically suppressing high-frequency background noise and enhancing defect edge responses. Second, a Cross-Scale Feature Fusion (CSFF) module based on dual-channel pooling is designed to ensure the continuity of defect features, thereby resolving the semantic misalignment issue in traditional fusion. Third, a Multi-Kernel Strip Shuffle (MKSS) module is incorporated, utilizing decomposed convolution kernels to capture the geometric features of slender scratches. Finally, a knowledge distillation strategy is employed to transfer structured knowledge from a complex teacher model to a lightweight student model. Experiments on the Tianchi aluminum defect dataset demonstrate that WMC-DFINE achieves a mAP of 82.1%, which surpasses algorithms including YOLOv12, RT-DETR, and the baseline model DFINE. Furthermore, the distilled student model, WMC-DFINE-distill, improves the mAP by 3.2% compared to DFINE, reduces parameter count by 47%, and achieves an inference speed of 59.75 FPS on the experimental equipment. The proposed method effectively resolves the problem of balancing background suppression and defect detail feature preservation, offering a practical and efficient scheme for real-time industrial defect inspection. Full article
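The frequency decoupling performed by a discrete wavelet transform can be illustrated in one dimension. The sketch below assumes the Haar wavelet and a single decomposition level, neither of which is stated in the abstract, and the WIFA module presumably operates on 2D feature maps rather than on a short list; `haar_dwt_1d` and `haar_idwt_1d` are illustrative names.

```python
import math


def haar_dwt_1d(signal):
    """One level of the orthonormal Haar DWT on an even-length sequence.

    Returns (approximation, detail): the low-frequency and high-frequency halves.
    """
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt_1d(approx, detail):
    """Inverse of haar_dwt_1d: rebuild the original samples pair by pair."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / s, (a - d) / s])
    return out


row = [4.0, 4.0, 4.0, 9.0, 4.0, 4.0]   # flat background with one sharp change
approx, detail = haar_dwt_1d(row)
restored = haar_idwt_1d(approx, detail)
```

The detail coefficients are zero over the flat background and non-zero only where the signal changes abruptly, which is the property that lets a frequency-domain attention mechanism weight edges and texture separately.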

(This article belongs to the Section Industrial Sensors)

36 pages, 2085 KB

Open AccessArticle

A Risk-Driven Maritime Patrol Route Optimization Framework for IUU Fishing Surveillance Using Multi-Source AIS and SAR Data Fusion

by Songtao Hu, Qianyue Zhang, Yiming Wang and Xiaokang Wang

J. Mar. Sci. Eng. 2026, 14(10), 878; https://doi.org/10.3390/jmse14100878 (registering DOI) - 9 May 2026

Abstract

Illegal, unreported, and unregulated (IUU) fishing threatens marine ecosystems in the Western Pacific. Conventional patrol strategies under-utilize the available multi-source surveillance data. This study proposes a maritime patrol-routing framework that integrates AIS fishing effort, Sentinel-1 SAR dark-vessel detections, and GFW vessel encounter records into a Surveillance Priority Index (SPI) over the study domain (0–20°N, 140–160°E). An Adaptive Priority-Boosted Ant Colony Optimization (APB-ACO) algorithm with two-phase deadline-aware route construction and best-of-N adaptive strategy selection produces patrol routes that cover high-priority cells within a 72 h window while minimizing total distance. Across 30 random seeds and a benchmark suite (PB-ACO, GA, PSO, DQN, NSGA-II), APB-ACO yields the shortest mean route ($21{,}658 \pm 9$ km, 7% shorter than PB-ACO, $p < 0.001$), the lowest variance ($46\times$ lower standard deviation than PB-ACO), and 100% high-priority coverage at default settings; a scalability analysis across 2–20% high-priority task ratios shows that the coverage gap over PB-ACO widens with the HP ratio. The problem is also formalized as a Mixed-Integer Linear Program (Priority-Constrained VRPTW), positioning APB-ACO as a constructive metaheuristic for an NP-hard operational problem. The framework’s principal limitation is that, in the tested three-vessel scenario, the 500 km inter-vessel communication constraint is violated more than 1,100 times per 72 h mission and is repaired post hoc; integrating this constraint into the optimizer is identified as a near-term extension. The results provide a methodological foundation for surveillance-driven patrol planning rather than a validated tool for operational IUU interdiction. Full article

(This article belongs to the Section Ocean Engineering)

19 pages, 1940 KB

Open AccessArticle

SA-DSM-MADDPG for Multi-UAV Cooperative Encirclement in Obstacle-Rich Pursuit–Evasion Scenarios

by Qing Liang, Yujie Yang, Shihao Liang and Hui Li

Drones 2026, 10(5), 360; https://doi.org/10.3390/drones10050360 - 9 May 2026

Abstract

Multi-UAV cooperative encirclement in pursuit–evasion scenarios requires effective coordination under dynamic inter-agent interactions, sparse task feedback, and obstacle-constrained motion. While MADDPG offers a practical CTDE framework for multi-agent continuous control, its direct application to cooperative encirclement still faces challenges in modeling time-varying teammate dependencies, selecting informative replay samples, and maintaining stable learning under delayed rewards. To address these challenges, we propose SA-DSM-MADDPG, an enhanced multi-agent deep deterministic policy gradient method that integrates the following: (i) a self-attention critic to model dynamic inter-agent relevance, (ii) a double-screened experience replay strategy combining prioritized sampling and relevance screening to improve replay quality, and (iii) curriculum learning with staged reward shaping to provide denser and more stable training signals. We evaluate the proposed method in 3v1 cooperative encirclement environments with static obstacles and varying initial conditions. Experimental results show that SA-DSM-MADDPG improves the success rate by approximately 22 percentage points over MADDPG and 35 percentage points over MAPPO, while also exhibiting faster convergence and better training stability. Full article
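The prioritized-sampling half of the replay strategy can be sketched with the standard proportional rule, in which transition i is drawn with probability proportional to its priority raised to a power alpha. This covers only generic prioritized replay; the relevance-screening stage of the paper's double-screened strategy is not described in the abstract and is not modelled here. The function names, the priorities, and `alpha=0.6` are assumptions.

```python
import random


def sampling_probabilities(priorities, alpha=0.6):
    """Proportional prioritization: P(i) = p_i**alpha / sum_k p_k**alpha."""
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def sample_indices(priorities, batch_size, rng, alpha=0.6):
    """Draw a minibatch of buffer indices according to the priorities."""
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)


priorities = [0.1, 2.0, 0.5, 1.0]            # hypothetical TD-error magnitudes
probs = sampling_probabilities(priorities)
uniform = sampling_probabilities(priorities, alpha=0.0)   # alpha=0 is uniform
batch = sample_indices(priorities, batch_size=3, rng=random.Random(0))
```

Setting alpha to 0 recovers uniform replay, so the exponent controls how strongly high-error transitions dominate each minibatch.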

(This article belongs to the Section Artificial Intelligence in Drones (AID))

21 pages, 937 KB

Open AccessArticle

FDE-Mamba: Selective State Space Modeling for Personal Voice Activity Detection

by Chien-Chia Chiu, Tai-You Chen, Tzu-Wei Wang, Berlin Chen and Jeih-Weih Hung

Appl. Sci. 2026, 16(10), 4688; https://doi.org/10.3390/app16104688 (registering DOI) - 9 May 2026

Abstract

Voice Activity Detection (VAD) and Personal Voice Activity Detection (PVAD) are fundamental components in modern voice-based human–machine interaction systems. While VAD distinguishes speech from non-speech segments, PVAD further identifies whether the detected speech belongs to a specific target speaker, enabling more robust performance in multi-speaker environments. Recently, the Flexible Dynamic Encoder RNN (FDE-RNN) has demonstrated state-of-the-art performance on PVAD tasks by leveraging a detachable Personalization module (P-module) built upon a Dynamic Encoder RNN backbone. However, the Long Short-Term Memory (LSTM) networks employed throughout FDE-RNN inherently suffer from sequential processing constraints that prevent parallelization across time steps, and their fixed-size hidden state may restrict representational capacity for fine-grained speaker discrimination. In this paper, we propose FDE-Mamba, which replaces all three LSTM components in FDE-RNN (the Prediction RNN, the Encoder RNN, and the P-module temporal model) with independent Mamba blocks, each equipped with a selective state space mechanism and an expansion layer for enriched feature representation. The proposed architecture retains the weighted residual connection, FiLM-based speaker embedding fusion, and parallel training strategy of the original FDE-RNN without modification. Experimental results on the LibriSpeech corpus demonstrate that FDE-Mamba achieves a PVAD mAP of 0.9605, representing a 1.97% relative improvement over the reproduced FDE-RNN baseline (0.9419), along with an accuracy improvement from 86.85% to 89.87% and a 3.16× reduction in real-time factor owing to the memory-efficient linear recurrences of the Mamba selective scan during inference, alongside its inherent parallelizability during training. Ablation studies further confirm that both the D skip connection and the expansion layer within each Mamba block contribute meaningfully to the observed performance gains, validating the effectiveness of each architectural design choice. These results suggest that Mamba is a compelling alternative to LSTM for temporal modeling in PVAD systems, and that the proposed integration provides a design blueprint for future selective SSM applications in gated PVAD architectures.
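The headline figures in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes the 1.97% figure is a relative gain in mAP while the accuracy change is read as a difference in percentage points.

```python
def relative_gain_percent(baseline: float, improved: float) -> float:
    """Relative improvement of `improved` over `baseline`, in percent."""
    return (improved - baseline) / baseline * 100.0


def point_gain(baseline_percent: float, improved_percent: float) -> float:
    """Absolute difference between two percentages, in percentage points."""
    return improved_percent - baseline_percent


# Values quoted in the abstract.
map_gain = relative_gain_percent(0.9419, 0.9605)   # mAP: FDE-RNN vs FDE-Mamba
acc_gain = point_gain(86.85, 89.87)                # accuracy, percentage points

print(f"mAP relative gain: {map_gain:.2f}%")
print(f"accuracy gain: {acc_gain:.2f} points")
```

Under these assumptions the mAP figures reproduce the stated 1.97%, which suggests that the abstract reports a relative rather than an absolute gain.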

(This article belongs to the Special Issue Application of Deep Learning in Speech Enhancement Technology)

24 pages, 13691 KB

Open AccessArticle

A Scene-Context-Aware Texture Privacy-Preserving Method for Photogrammetric 3D Urban Models

by Qianwen Zhou, Na Ren, Changqing Zhu and Jingyi Cai

Remote Sens. 2026, 18(10), 1468; https://doi.org/10.3390/rs18101468 - 8 May 2026

Viewed by 185

Abstract

Texture privacy preservation is a key technique for enabling the secure sharing and compliant use of three-dimensional (3D) urban models. However, maintaining visually continuous textures after privacy-preserving processing remains a challenging task due to the fragmented storage of textures in models. Considering the continuity of texture representation within the 3D scene, this paper proposes a scene-context-aware texture privacy-preserving method for 3D urban models. Specifically, a texture fragmentation metric is first introduced to adaptively determine the optimal detection level for sensitive targets, enabling accurate localization and segmentation of sensitive regions. Subsequently, scene-level texture contextual information associated with the detected regions is incorporated to guide a scene-aware texture repair strategy, achieving spatially consistent texture reconstruction. Furthermore, a multi-level texture pyramid mapping mechanism is established to ensure texture representation consistency across different levels of detail of the model. Experimental results demonstrate that the proposed method improves post-preservation texture continuity by approximately 56.9–79.5% compared with representative approaches that rely on fragmented texture maps, significantly enhancing the usability and reliability of privacy-preserving results. Overall, this work provides a novel technical framework for privacy-preserving processing of 3D urban models and offers new insights for secure data sharing in smart city applications.

(This article belongs to the Section Earth Observation Data)

24 pages, 7417 KB

Open AccessArticle

MSFE-Net: A Task-Oriented Optical–SAR Fusion Framework for Robust Industrial Object Detection

by Rufeng Guo, Rong Gui, Jun Hu, Pinjun Tang, Liang Cao, Jinghui Zhang and Qiao Jiang

Remote Sens. 2026, 18(10), 1466; https://doi.org/10.3390/rs18101466 - 8 May 2026

Viewed by 189

Abstract

Object detection in high-resolution remote sensing images under complex industrial environments is fundamentally constrained by the inherent limitations of single-modality sensors. Optical imagery is prone to background confusion and pseudo-target interference, while synthetic aperture radar (SAR) imagery suffers from speckle noise and structural ambiguity. This work investigates a critical evaluation gap in multimodal fusion, where traditional image-level quality metrics do not consistently reflect downstream detection performance. To address this issue, we propose a task-oriented framework termed the Multi-Source Fusion for Enhanced Object Detection Network (MSFE-Net). The proposed method integrates pixel-level optical–SAR fusion with a YOLOv11-based detector, enabling the learning of task-relevant representations by exploiting complementary optical spectral cues and SAR scattering characteristics. Extensive experiments are conducted across multiple fusion strategies and representative detection architectures on two industrial datasets covering oil tanks and photovoltaic arrays. The results consistently reveal a nonlinear decoupling between image-level fusion metrics and detection accuracy, indicating that improvements in global statistical image quality do not necessarily lead to superior task performance. Furthermore, the proposed framework demonstrates improved robustness in complex scenarios involving multi-scale and weak targets. Specifically, MSFE-Net achieves 99.1% mAP@50 for oil tank detection (19.5% improvement over SAR-only baselines) and 90.2% mAP@50 for photovoltaic array detection, with stable performance across different evaluation settings. These results highlight the importance of task-oriented evaluation in multimodal remote sensing fusion and suggest that downstream detection performance provides a more reliable criterion than conventional image-quality metrics.
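For readers unfamiliar with the mAP@50 metric quoted above, the "50" refers to an intersection-over-union (IoU) threshold of 0.5 between a predicted box and a ground-truth box. The sketch below illustrates only that matching criterion; it is not the authors' evaluation code, the function names are hypothetical, and boxes are assumed to be axis-aligned `(x1, y1, x2, y2)` tuples.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    # Corners of the overlap rectangle.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def matches_at_50(pred_box, gt_box):
    """True when a prediction overlaps the ground truth with IoU >= 0.5."""
    return iou(pred_box, gt_box) >= 0.5


gt = (0, 0, 10, 10)
print(matches_at_50((2, 0, 12, 10), gt))  # overlap 80 / union 120
print(matches_at_50((5, 0, 15, 10), gt))  # overlap 50 / union 150
```

Average precision is then computed per class from the ranked predictions that pass this test, and mAP@50 averages those values over classes.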

(This article belongs to the Special Issue Advances in Remote Sensing Image Target Detection and Recognition)

Search Results (2,017)
