Search Results (2,856)

Search Parameters:
Keywords = hierarchical features

42 pages, 16476 KB  
Article
PIMSEL: A Physically Guided Multi-Modal Semi-Supervised Learning Framework for Earthquake-Induced Landslide Reactivation Risk Assessment
by Bingxin Shi, Hongmei Guo, Zongheng He, Shi Chen, Jia Guo, Yunxi Dong, Bingyang Shi, Jingren Zhou, Yusen He and Huajin Li
Remote Sens. 2026, 18(9), 1320; https://doi.org/10.3390/rs18091320 (registering DOI) - 25 Apr 2026
Abstract
Earthquake-induced landslide reactivation poses a sustained hazard for years following major seismic events, yet operational prediction remains constrained by heterogeneous multi-modal data, sparse supervision, and the absence of uncertainty-aware frameworks. This paper presents PIMSEL, a physically guided multi-modal semi-supervised framework for post-seismic landslide reactivation risk assessment. PIMSEL integrates satellite-derived morphological features, precipitation time series, and seismic hazard attributes through four components: entropy-regularized optimal transport for cross-modal semantic alignment without paired supervision; causally constrained hierarchical fusion enforcing domain-consistent modal weighting; scenario-based prototype mutation for semi-supervised learning from sparse expert annotations; and prototype-anchored variational graph clustering that simultaneously stratifies landslides into HIGH, MEDIUM, and LOW risk tiers and produces decomposed aleatoric and epistemic uncertainty estimates for operational triage. The HIGH risk tier operationally corresponds to predicted reactivation, validated against 598 documented reactivation events across 7482 co-seismic landslides from three Sichuan Province earthquake sequences: the 2013 Lushan (Mw 7.0), 2017 Jiuzhaigou (Mw 7.0), and 2022 Luding (Mw 6.8) events. PIMSEL achieves 82.5% reactivation recall and 66.4% precision, outperforming twelve baselines across clustering quality, classification, and uncertainty calibration metrics. Ablation studies confirm that optimal transport alignment contributes the largest individual performance gain. Current limitations include quarterly assessment frequency and dependence on optical imagery under cloud cover, which future integration of real-time meteorological triggers and SAR data should address. Full article
22 pages, 5563 KB  
Article
A Spectrum-Driven Hierarchical Learning Network for Aero-Engine Defect Segmentation
by Yining Xie, Aoqi Shen, Haochen Qi, Jing Zhao, Jianpeng Li, Xichun Pan and Anlong Zhang
Computation 2026, 14(5), 99; https://doi.org/10.3390/computation14050099 (registering DOI) - 25 Apr 2026
Abstract
Aero-engine defects often exhibit micro-scale and high-frequency characteristics under complex metallic textures, which makes precise segmentation difficult. Most existing pixel-level methods rely on spatial-domain modeling and lack frequency-domain decoupling. As a result, high-frequency details are easily hidden by low-frequency background information. In addition, repeated downsampling weakens the representation of fine-grained structures, leading to inaccurate boundary localization and limited robustness. To address these issues, a spectrum-driven hierarchical learning network is proposed for aero-engine defect segmentation. First, a dual-band spectral module is constructed using the discrete cosine transform to separate high-frequency and low-frequency components, providing stable and physically meaningful frequency-domain priors for the network. Second, a detail-guided module is designed where high-frequency features adaptively guide skip connections, compensating information loss during encoding and improving boundary recovery. Furthermore, a low-frequency-driven region-aware modeling module is developed. The internal defect regions, boundary areas, and background regions are modeled hierarchically. A dynamic hyper-kernel generation mechanism performs region-sensitive convolutional modeling, improving adaptation to complex structural variations. Extensive experiments on the Turbo19 and NEU-Seg datasets demonstrate that the proposed method produces accurate defect boundaries and achieves mIoU scores of 89.82% and 91.44%, improving over the second-best method by 5.22% and 4.42%, respectively. Full article
(This article belongs to the Section Computational Engineering)
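The dual-band idea mentioned in the abstract above (separating high- and low-frequency components with the discrete cosine transform) can be illustrated with a minimal sketch. This is not the authors' module: the function name `split_bands`, the `cutoff` parameter, and the rule "low frequency means index sum u + v below the cutoff" are assumptions made only for illustration.

```python
import numpy as np
from scipy.fft import dctn, idctn


def split_bands(image, cutoff):
    """Split a 2-D array into low- and high-frequency parts via the DCT.

    `cutoff` is a hypothetical parameter: DCT coefficients whose index
    sum u + v is below it are treated as low-frequency.
    """
    coeffs = dctn(image, norm="ortho")            # orthonormal 2-D DCT-II
    u, v = np.indices(coeffs.shape)               # coefficient index grids
    low_mask = (u + v) < cutoff
    # Invert each band separately; the transform is linear, so the two
    # reconstructions add back up to the input.
    low = idctn(np.where(low_mask, coeffs, 0.0), norm="ortho")
    high = idctn(np.where(low_mask, 0.0, coeffs), norm="ortho")
    return low, high


rng = np.random.default_rng(0)
img = rng.random((16, 16))
low, high = split_bands(img, cutoff=6)
print(np.allclose(low + high, img))
```

Because the split is a partition of the coefficients, nothing is lost: the two bands sum to the original array, and a learned model could process each band separately before recombining them.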
16 pages, 4351 KB  
Article
Representation-Centric Deep Learning for Multi-Class, Multi-Organ Histopathology Image Classification
by Li Hao and Ma Ning
Algorithms 2026, 19(5), 336; https://doi.org/10.3390/a19050336 (registering DOI) - 25 Apr 2026
Abstract
Imaging-based multi-omics derived from digital histopathology provides a valuable approach for characterizing tumor heterogeneity from routine clinical specimens. However, robust multi-cancer histopathological analysis remains challenging due to pronounced intra-tumor variability, inter-organ morphological overlap, and sensitivity to staining and acquisition variations, which can limit the generalizability of deep learning models. These limitations are largely driven by insufficient representation learning, particularly in multi-organ and multi-class diagnostic settings. In this study, we propose a hierarchically regularized representation learning framework for multi-cancer histopathological image analysis that models imaging-based features across multiple organs and diagnostic categories. The framework integrates complementary mechanisms to capture fine-grained cellular morphology, long-range tissue architecture, and organ-aware diagnostic semantics within a unified computational model. A hierarchical supervision strategy guides the network to reduce entanglement between organ-level structural characteristics and disease-specific diagnostic patterns in the learned representations. The method operates without pixel-level annotations or handcrafted morphological priors, supporting scalable experimental evaluation. We demonstrate the approach on balanced lung and colon cancer histopathology cohorts, achieving 96.5% accuracy on lung cancer classification and 96.8% accuracy on colon cancer classification. Ablation and robustness analyses further validate the contributions of hierarchical regularization and consistency learning. Overall, this work provides a demonstrated proof-of-concept framework for representation-centric imaging-based analysis in multi-organ histopathology under the evaluated dataset conditions. Full article
17 pages, 2710 KB  
Article
DPA-HiVQA: Enhancing Structured Radiology Reporting with Dual-Path Cross-Attention
by Ngoc Tuyen Do, Minh Nguyen Quang and Hai Van Pham
Mach. Learn. Knowl. Extr. 2026, 8(5), 113; https://doi.org/10.3390/make8050113 (registering DOI) - 24 Apr 2026
Abstract
Structured radiology reporting can improve clinical decision support by standardizing clinical findings into hierarchical formats. However, answering the thousands of questions about clinical findings in structured report templates is prohibitively time-consuming, which can limit clinical adoption. Furthermore, early medical VQA datasets primarily focused on free-text and independent question–answer pairs. A recent dataset, Rad-ReStruct, introduced hierarchical VQA, but the accompanying model still relies heavily on flattened embedding representations and single-path text–image fusion mechanisms that inadequately handle complex hierarchical dependencies in responses. In this paper, we propose DPA-HiVQA (Dual-Path Cross-Attention for Hierarchical VQA), addressing these limitations through two key contributions: (1) a multi-scale image embedding that combines global semantic embeddings with patch-level spatial features from the domain-specific BioViL encoder; (2) a dual-path cross-attention mechanism enabling simultaneous holistic semantic understanding and fine-grained spatial reasoning. Evaluated on the Rad-ReStruct benchmark, the model substantially outperforms the established benchmark baseline, improving the overall F1-score and the Level 3 F1-score by 21.2% and 31.9%, respectively. The proposed model demonstrates that dual-path cross-attention architectures can effectively connect holistic semantic understanding and fine-grained spatial detail, paving the way for practical AI-assisted structured reporting systems that reduce radiologist burden while maintaining diagnostic accuracy. Full article
31 pages, 2203 KB  
Article
Hierarchical Multi-View Representation Learning via Generalized Deep Non-Negative Matrix Factorization
by Hubo Tan, Yuan Wan, Guoqing Luo and Zaichun Sun
Mathematics 2026, 14(9), 1442; https://doi.org/10.3390/math14091442 - 24 Apr 2026
Abstract
Multi-view clustering aims to exploit complementary information from multiple views to uncover intrinsic grouping structures in data, where effective representation learning plays a critical role. Non-negative matrix factorization (NMF) has been widely used for multi-view representation learning due to its inherent interpretability; however, most existing NMF-based methods rely on shallow architectures and are therefore insufficient for capturing hierarchical characteristics. Although recent deep NMF models introduce multi-layer structures by factorizing either feature matrices or basis matrices, their performance may degrade when the data are limited or exhibit relatively simple structures. To address these issues, this paper proposes a generalized deep non-negative matrix factorization framework for multi-view representation learning, termed GDNMF-MRL, which jointly decomposes feature and basis matrices to learn hierarchical representations. By integrating shallow linear components with deep nonlinear structures, the proposed method enhances representation capability and yields more discriminative latent subspaces. Furthermore, a one-step variant, termed OS-GDNMF-MRL, is developed to simultaneously learn latent representations and clustering assignments within a unified optimization framework, enabling direct interaction between representation learning and clustering without requiring separate post-processing. Two efficient alternating optimization algorithms with guaranteed convergence of the objective function are derived, and extensive experiments on benchmark datasets demonstrate that the proposed methods consistently outperform several state-of-the-art multi-view clustering approaches. Full article
(This article belongs to the Section E: Applied Mathematics)
31 pages, 22857 KB  
Article
Congestion-Aware Adaptive Routing Based on Graph Attention Networks and Dynamic Cost Optimization
by Jun Liu, Xinwei Li and Lingyun Zhou
Symmetry 2026, 18(5), 719; https://doi.org/10.3390/sym18050719 - 24 Apr 2026
Abstract
To mitigate local congestion and address the adaptability limitations of traditional static routing under dynamic traffic, this paper proposes an end-to-end routing method based on a Graph Attention Network (GAT), termed Congestion-Aware Graph Attention Routing (CA-GAR). To alleviate the issue of local optima in traditional heuristic iterative optimization, we design a dynamic link cost optimization algorithm with multi-start parallel exploration. This algorithm employs a “penalty–reselection–reward” closed-loop feedback mechanism, performing global searches from multiple random initial states to generate a high-quality, empirically near-optimal cost matrix as supervised labels. Building on this, CA-GAR leverages a multi-head attention mechanism to adaptively aggregate high-order topological features of nodes and edges, and incorporates a staged hierarchical hyperparameter optimization strategy to map real-time network states to link costs. Simulation results demonstrate that CA-GAR outperforms traditional static routing under light, medium, and heavy loads. Under high-load burst conditions, the method exhibits effective congestion avoidance capability, reducing end-to-end delay by approximately 50% and lowering the packet loss rate to as low as 2%. Compared with QLRA, CA-GAR shows promising performance in multi-path traffic splitting and possesses robust fast rerouting capabilities during node failures, thereby achieving intelligent traffic distribution and global load balancing. Full article
(This article belongs to the Special Issue Symmetry in Computational Intelligence and Data Science)
20 pages, 5677 KB  
Article
Robust Image Watermarking via Clustered Visual State-Space Modeling
by Bo Liu and Jianhua Ren
Appl. Sci. 2026, 16(9), 4166; https://doi.org/10.3390/app16094166 - 24 Apr 2026
Abstract
Most existing DNN-based image watermarking methods adopt an “encoder–noise–decoder” paradigm, where the watermark is typically replicated and expanded in a straightforward manner and then directly fused with image features, which limits robustness under complex distortions. Although Transformers improve fusion via attention mechanisms, their quadratic computational complexity makes high-resolution processing prohibitively expensive. To address these issues, we propose CCViM, a robust watermarking framework built on Vision Mamba, which leverages the linear-complexity property of state-space models (SSMs) to enable efficient global interactions. We design a Watermark Representation Learning Module (WRLM) that performs hierarchical feature extraction and structured expansion of the watermark through cascaded VSS blocks, yielding semantically rich and perturbation-resistant watermark representations. In addition, we introduce an Interwoven Fusion Enhancement Module (IFEM), which employs a CCS6 structure to treat the watermark as a dynamic guidance signal. By combining contextual clustering with the Mamba mechanism, IFEM deeply interweaves the watermark into host features at both local and global levels. Experiments on COCO, DIV2K, and ImageNet demonstrate that CCViM consistently improves imperceptibility, robustness, and efficiency to varying degrees, and remains stable and high quality under attacks such as JPEG compression, cropping, and Gaussian blur. Full article
(This article belongs to the Special Issue Advanced Pattern Recognition & Computer Vision, 2nd Edition)
23 pages, 2175 KB  
Article
Semantic Segmentation of Sparse Array-SAR 3D Point Clouds Using an Enhanced PointNet++ Framework
by Ya Shu, Lei Pang and Miao Li
Appl. Sci. 2026, 16(9), 4149; https://doi.org/10.3390/app16094149 - 23 Apr 2026
Abstract
The semantic segmentation of sparse array synthetic aperture radar (SAR) 3D point clouds remains a significant challenge. These datasets are characterized by extreme sparsity, irregular distribution, and structural discontinuity, factors that diminish the reliability of local neighborhoods and impede the performance of traditional segmentation algorithms. This study introduces an enhanced PointNet++ framework specifically tailored for the semantic segmentation of sparse array-SAR 3D point clouds. Utilizing PointNet++ as a hierarchical backbone, the proposed architecture incorporates three geometry-oriented modifications: a feature enhancement strategy integrating normalized height, surface normals, and local density; an EdgeConv module positioned at an intermediate abstraction stage to reinforce local geometric modeling; and an FP-Refine module designed to optimize cross-scale feature propagation and recovery within sparse regions. Rather than proposing a fundamentally distinct universal architecture, this research focuses on a task-oriented adaptation of PointNet++ to address the neighborhood instability and structural gaps inherent in sparse array-SAR data. Experimental evaluations using the SARMV3D-1.0 dataset indicate that the proposed method consistently outperforms the PointNet++ baseline, maintaining stable performance across various random seeds with an mIoU between 55% and 58%. Further validation through ablation studies, parameter sensitivity analyses, and perturbation-based robustness assessments confirms the utility of the integrated components. Additionally, cross-dataset experiments on S3DIS and Toronto3D suggest that the framework generalizes effectively to point clouds with varying densities and spatial configurations. The findings demonstrate that the method is particularly successful for categories defined by distinct vertical geometry and structural continuity, such as trees, roofs, and facades, though performance remains limited for weakly structured classes like roads. Full article
24 pages, 7452 KB  
Article
Time-Series Clustering Leveraging Inter-Network Heterogeneity from a Spectral Symmetry Perspective
by Xiaolei Zhang, Qun Liu, Qi Li, Dehui Wang and Hongguang Jia
Symmetry 2026, 18(5), 713; https://doi.org/10.3390/sym18050713 - 23 Apr 2026
Abstract
Time-series clustering is a prominent research area with extensive practical applications. Given the complexity and diversity of modern time-series data, this study proposes a novel time-series clustering method based on inter-network heterogeneity. First, each time-series is converted into a network by using two types of time-series segmentation techniques. Second, an inter-network clustering approach based on graph spectral theory is introduced: we calculate the total variation (TV) distance between the empirical spectral distributions of each network and identify distinct clusters using a hierarchical clustering algorithm. From the perspective of symmetry, networks constructed from similar time-series tend to exhibit comparable spectral structures, which reflect the underlying structural symmetries of their dynamics. Differences in spectral distributions correspond to symmetry breaking among networks, providing an effective mechanism for distinguishing heterogeneous time-series patterns. Our method effectively preserves more distinctive features inherent in the original time-series. To evaluate the performance of the proposed method, simulation studies are conducted, including the recognition of both stationary and non-stationary sequences. The method also performs well on real-world datasets, such as stock closing prices. These results demonstrate that our approach can handle non-stationary sequences and identify the intrinsic correlations in time-series. Full article
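The pipeline outlined in the abstract above (series to network, empirical spectral distribution, total variation distance, hierarchical clustering) can be sketched roughly as follows. The abstract does not specify the two segmentation techniques, so the network construction here (quantile bins as nodes, observed transitions as edges) is a stand-in assumption, as are the function names, the histogram grid, and the average-linkage choice.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform


def series_to_network(x, n_bins=8):
    """Assumed construction: quantile bins are nodes, transitions are edges."""
    x = np.asarray(x, dtype=float)
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    states = np.searchsorted(edges, x)            # bin index in 0..n_bins-1
    adj = np.zeros((n_bins, n_bins))
    np.add.at(adj, (states[:-1], states[1:]), 1.0)
    adj = adj + adj.T                             # symmetric, so the spectrum is real
    total = adj.sum()
    return adj / total if total > 0 else adj      # eigenvalues now lie in [-1, 1]


def spectral_histogram(adj, grid):
    """Empirical spectral distribution as a normalized histogram."""
    hist, _ = np.histogram(np.linalg.eigvalsh(adj), bins=grid)
    return hist / hist.sum()


def cluster_series(series_list, n_clusters, n_bins=8):
    grid = np.linspace(-1.0, 1.0, 41)
    hists = [spectral_histogram(series_to_network(s, n_bins), grid)
             for s in series_list]
    n = len(hists)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            # Total variation distance between two discrete distributions.
            dist[i, j] = dist[j, i] = 0.5 * np.abs(hists[i] - hists[j]).sum()
    tree = linkage(squareform(dist, checks=False), method="average")
    return fcluster(tree, t=n_clusters, criterion="maxclust"), dist


rng = np.random.default_rng(0)
series = [rng.normal(size=300) for _ in range(3)]
series += [np.cumsum(rng.normal(size=300)) for _ in range(3)]
labels, dist = cluster_series(series, n_clusters=2)
print(labels)
```

The distance matrix depends only on each network's spectrum, so series of different lengths can be compared as long as they are mapped to networks of the same size.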
22 pages, 5140 KB  
Article
Application of Deep Multi-Scale Representation Learning Based on Eye-Tracking and Facial Expression Data in Cognitive Decline Assessment
by Yanfeng Xue, Xianpeng Luo, Shuai Guo and Tao Song
Sensors 2026, 26(9), 2600; https://doi.org/10.3390/s26092600 - 23 Apr 2026
Abstract
Digital biomarkers derived from eye-tracking and facial expression hold significant potential for the non-invasive screening of cognitive decline (CD). However, existing approaches predominantly rely on single-task or feature engineering-based unimodal methods, which struggle to capture complex temporal behavioral patterns. While deep learning (DL) excels at extracting hierarchical features and intricate temporal dynamics from behavioral sequences, its application in this specific multimodal sensing domain remains exploratory. Addressing this gap, this study designed an assessment system integrating five multi-dimensional cognitive paradigms and collected eye-tracking and facial expression data from 20 healthy controls (HC) and 20 individuals with CD. For these multimodal sequences, we propose a deep neural network capable of multi-scale representation learning. By utilizing subspace exploration and multi-scale convolutions, this architecture extracts deep representations directly from data and incorporates a decision fusion mechanism to enhance diagnostic robustness. Experimental results demonstrate that our method achieves a 90% classification accuracy, outperforming machine learning models. Furthermore, statistical analyses conducted in this study validated several features associated with CD and also explored some novel potential behavioral patterns. This study confirms the feasibility of a DL framework based on eye-tracking and facial expression signals for identifying CD, providing a reference for developing objective and efficient digital screening tools. Full article
(This article belongs to the Section Biomedical Sensors)
23 pages, 47800 KB  
Article
AIGC-Driven Short Video Generation Based on the Controllable Multimodal Fusion Architecture
by Yan Zhu, Wei Li, Caixia Fan and Lu Yu
Electronics 2026, 15(9), 1783; https://doi.org/10.3390/electronics15091783 - 22 Apr 2026
Viewed by 161
Abstract
The utilization of Artificial Intelligence-Generated Content (AIGC) has attracted widespread attention in video content creation. To generate high-quality videos, this paper presents a controllable multimodal fusion architecture for AIGC-driven short-video production. This architecture employs hierarchical constraint mechanisms and a multimodal attention fusion mechanism to enhance video content coherence and user controllability. Specifically, a scene coherence scheme is first designed to construct graph-based global and transition-level constraints by integrating text descriptions, reference images, and audio features. By leveraging the extracted style vector data, preliminary video clips are then generated through a combination of the cross-modal fusion unit and the spatio-temporal consistency unit. Finally, a fine-grained adjustment mechanism is implemented to ensure logical consistency and stylistic uniformity in the AIGC-generated videos. Experimental results indicate that the proposed architecture improves generation quality, controllability, and cross-segment coherence under the adopted evaluation settings. Full article
26 pages, 14981 KB  
Article
Dynamic Conflict Footprints and Land-System Transformation in Large-Scale Mining: Evidence from Las Bambas, Peru
by Soledad Espezúa, Rodrigo Caballero, Álvaro Talavera and Luciano Stucchi
Land 2026, 15(5), 698; https://doi.org/10.3390/land15050698 - 22 Apr 2026
Viewed by 108
Abstract
Socio-environmental conflicts in mining regions are often examined through political, economic, or social lenses, while the role of land-system transformation remains less integrated into quantitative analysis. This study examines the co-evolution of socio-environmental conflict and territorial change in Las Bambas (Apurímac, Peru) as a socio-territorial process. Annual conflict records from the Peruvian Ombudsman’s Office (2007–2024) were combined with annual land-cover data from MapBiomas. Yearly conflict influence zones were reconstructed from reported affected communities and geographic features using buffered spatial entities and concave hull polygons. Clustering methods (K-medoids, DBSCAN, and agglomerative hierarchical clustering) and FP-Growth association rule mining were applied to 23 unique conflicts consolidated from the original records and encoded with 10 root causes. The most intense conflict phases were accompanied by measurable landscape transformations, including the emergence of mining-related land cover from 2012 onward, sustained loss of high-Andean natural vegetation, expansion of agricultural mosaics, urban growth along the Apurímac–Cusco corridor, and hydrological alterations in wetlands and headwaters. Three conflict typologies were identified, with unfulfilled company commitments emerging as the most recurrent co-occurring grievance. The dynamic polygon approach offers a replicable framework for linking conflict records with land-system change in extractive regions. Full article
(This article belongs to the Section Land Systems and Global Change)
37 pages, 3754 KB  
Article
A Multi-UAV Cooperative Decision-Making Method in Dynamic Aerial Interaction Environments Based on GA-GAT-PPO
by Maoming Zou, Zhengyu Guo, Jian Zhang, Yu Han, Caiyi Chen, Huimin Chen and Delin Luo
Drones 2026, 10(5), 313; https://doi.org/10.3390/drones10050313 - 22 Apr 2026
Viewed by 101
Abstract
Autonomous task assignment in multi-unmanned aerial vehicle (UAV) systems operating in dynamic and safety-critical airspace environments is highly challenging due to complex spatial interactions and rapidly changing relative geometries. This paper proposes a hierarchical decision-making framework that bridges individual maneuvering behaviors with cooperative task allocation in multi-agent aerial systems. First, a high-fidelity single-agent maneuver model is learned using a physics-consistent simulation environment, where spatial advantage is evaluated based on relative distance and angular relationships within a kinematically feasible interaction zone (KIZ). Subsequently, a Geometry-Aware Graph Attention Network (GA-GAT) is developed to address scalable multi-agent assignment problems. Unlike conventional approaches that rely on flat feature representations, the proposed method explicitly incorporates kinematic feasibility constraints into the attention mechanism via a novel gating module, enabling efficient relational reasoning under dynamic conditions. The proposed framework is applicable to a range of civilian and safety-oriented scenarios, including UAV swarm coordination, emergency response monitoring, infrastructure inspection, and autonomous airspace management. Simulation results demonstrate that the GA-GAT-based approach significantly outperforms heuristic baselines in terms of coordination efficiency and overall system performance in complex multi-agent environments. This study highlights that decoupling maneuver-level control from high-level coordination provides a scalable and computationally efficient solution for real-time multi-UAV decision-making in safety-critical applications. The proposed framework is designed for general multi-agent coordination problems in civilian aerial applications. Full article
(This article belongs to the Special Issue UAV Swarm Intelligent Control and Decision-Making)
25 pages, 19124 KB  
Article
Multi-Scale Fractional-Order Image Fusion Algorithm Based on Polarization Spectral Images
by Zhenduo Zhang, Xueying Cao and Zhen Wang
Appl. Sci. 2026, 16(9), 4087; https://doi.org/10.3390/app16094087 - 22 Apr 2026
Viewed by 75
Abstract
With the continuous advancement of polarization spectral sensing technology, multi-band polarization image fusion has emerged as a novel approach to image fusion. By integrating spectral and polarization information, this method overcomes the limitations of relying on a single information source and significantly improves overall image quality. To address this, this paper proposes a new polarization spectral fusion algorithm. First, feature matching is employed to achieve pixel-level spatial alignment of multi-band polarization images. Then, a fusion strategy based on multi-scale decomposition and singular value decomposition is adopted to preserve structural information and fine details. Subsequently, fractional-order processing and guided filtering are applied to enhance details and suppress noise. Finally, a progressive reconstruction from low to high scales is performed to ensure hierarchical consistency and information integrity throughout the fusion process. In addition, spectral information is utilized for color restoration, enabling the final image to achieve high spatial resolution while maintaining natural and rich color representation. Experimental results demonstrate that the proposed method effectively integrates features from different spectral bands and polarization information while preserving maximum similarity, leading to significant improvements in both image quality and detail representation.
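The abstract does not say which fractional-order operator is used. One common choice in image enhancement is the Grünwald-Letnikov difference, and the sketch below assumes it purely for illustration, applied to a 1-D signal with zero padding at the left edge. The function names `gl_coeffs` and `frac_diff` are hypothetical.

```python
def gl_coeffs(alpha, n_terms):
    """Grünwald-Letnikov weights: w_0 = 1, w_k = w_{k-1} * (1 - (alpha + 1) / k)."""
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (1.0 - (alpha + 1.0) / k))
    return weights


def frac_diff(signal, alpha):
    """Fractional difference of order alpha; samples before index 0 are treated as zero."""
    w = gl_coeffs(alpha, len(signal))
    return [
        sum(w[k] * signal[i - k] for k in range(i + 1))
        for i in range(len(signal))
    ]


# alpha = 1 reduces to the ordinary first difference.
first = frac_diff([1.0, 3.0, 6.0, 10.0], 1.0)
# alpha = 0.5 sits between the identity (alpha = 0) and the first difference.
half = gl_coeffs(0.5, 3)
```

With `alpha = 1` the weights collapse to `1, -1, 0, ...`, which is the ordinary first difference. Fractional orders between 0 and 1 keep a decaying tail of weights, so they respond to edges while retaining some low-frequency content.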
24 pages, 2996 KB  
Article
A Multi-Scale Temporal Representation-Enhanced Informer for Wastewater Effluent Quality Prediction
by Juan Wu, Yifan Wu, Yongze Liu and Xiaoyu Zhang
Appl. Sci. 2026, 16(9), 4078; https://doi.org/10.3390/app16094078 - 22 Apr 2026
Viewed by 80
Abstract
Accurate prediction of effluent water quality is essential for the intelligent and sustainable operation of wastewater treatment plants (WWTPs). However, this task remains challenging due to the strong nonlinearity, long-term temporal dependencies, and severe fluctuations inherent in influent characteristics. In this study, a novel data-driven framework termed the Multi-Scale Temporal Representation-Enhanced Informer (MTRE-Informer) is proposed to predict key effluent quality indicators, including total nitrogen (TN), total phosphorus (TP), and chemical oxygen demand (COD). To ensure data quality and computational efficiency, a generative recurrent learning framework is first employed for anomaly detection and correction, followed by variance inflation factor (VIF)-based feature selection to mitigate multicollinearity. Furthermore, feature contribution analysis is conducted to improve model interpretability. Subsequently, the core MTRE-Informer architecture utilizes hierarchical multi-scale temporal representation learning to simultaneously capture local patterns and long-term dependencies within the complex dynamics of the wastewater treatment process. Experimental results demonstrate that the MTRE-Informer achieves robust and stable predictive performance across diverse operational datasets. For TN prediction, the proposed framework attains a coefficient of determination (R²) of 0.9637 and a mean absolute percentage error (MAPE) of 3.39%. Compared with baseline approaches, the improvement ranges from 3.8% to 14.2%, validating its superior capability. To further enhance model robustness, an anomaly detection and correction strategy based on a generative recurrent learning framework is employed. In addition, feature contribution analysis and VIF-based feature selection are conducted to improve interpretability, mitigate multicollinearity, and enhance computational efficiency. Overall, this framework provides a reliable and practical solution for real-time effluent quality prediction, facilitating the intelligent management of WWTPs.
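For readers unfamiliar with the quantities named in this abstract, the sketch below computes MAPE, the coefficient of determination (R²), and a VIF on toy data. The numbers are made up and unrelated to the paper's datasets. The VIF helper covers only the two-feature case, where VIF = 1 / (1 - r²) with r the Pearson correlation; the function names are hypothetical.

```python
def mape(y_true, y_pred):
    """Mean absolute percentage error in percent (y_true must not contain zeros)."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def vif_two_features(x1, x2):
    """VIF for a pair of features; undefined (division by zero) if perfectly collinear."""
    m1 = sum(x1) / len(x1)
    m2 = sum(x2) / len(x2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    var1 = sum((a - m1) ** 2 for a in x1)
    var2 = sum((b - m2) ** 2 for b in x2)
    r2 = cov * cov / (var1 * var2)
    return 1.0 / (1.0 - r2)


# Toy values for illustration only.
y_true = [10.0, 20.0, 30.0]
y_pred = [11.0, 19.0, 30.0]
error_pct = mape(y_true, y_pred)
fit = r_squared(y_true, y_pred)
vif = vif_two_features([1.0, 2.0, 3.0, 4.0], [1.0, -1.0, -1.0, 1.0])
```

A VIF of 1 means the two features are uncorrelated, and larger values signal stronger collinearity. Cutoffs of around 5 or 10 are commonly used for dropping a feature, though the abstract does not state which threshold the authors applied.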