Search Results (186)

Search Parameters:
Keywords = multi-stage feature aggregation

24 pages, 15151 KB  
Article
SG-YOLO: A Multispectral Small-Object Detector for UAV Imagery Based on YOLO
by Binjie Zhang, Lin Wang, Quanwei Yao, Keyang Li and Qinyan Tan
Remote Sens. 2026, 18(7), 1003; https://doi.org/10.3390/rs18071003 - 27 Mar 2026
Abstract
Object detection in unmanned aerial vehicle (UAV) imagery remains a crucial yet challenging task due to complex backgrounds, large scale variations, and the prevalence of small objects. Visible-spectrum images lack robustness under all-weather and all-illumination conditions; by contrast, multispectral sensing provides complementary cues (e.g., thermal signatures) that improve detection robustness. However, existing multispectral solutions often incur high computational costs and are therefore difficult to deploy on resource-constrained UAV platforms. To address these issues, SG-YOLO is proposed, a lightweight and efficient multispectral object detection framework that aims to balance accuracy and efficiency. First, a Spectral Gated Downsampling Stem (SGDS) is designed, in which grouped convolutions and a gating mechanism are employed at the early stage of the network to extract band-specific features, thereby maximizing spectral complementarity while minimizing redundancy. Second, a Spectral–Spatial Iterative Attention Fusion (SSIAF) module is introduced, in which spectral-wise (channel) attention and spatial-wise attention are iteratively coupled and cascaded in a multi-scale manner to jointly model cross-band dependencies and spatial saliency, thereby aggregating high-level semantic information while suppressing redundant spectral responses. Finally, a Spatial–Channel Synergistic Fusion (SCSF) module is designed to enhance multi-scale and cross-channel feature integration in the neck. Experiments on the MODA dataset show that SG-YOLO achieves 72.4% mAP50, outperforming the baseline by 3.2%. Moreover, compared with a range of mainstream one-stage detectors and multispectral detection methods, SG-YOLO delivers the best overall performance, providing an effective solution for UAV object detection while maintaining a favorable trade-off between model size and detection accuracy.
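The SGDS idea described here — band-grouped features modulated by a gate — can be sketched in miniature. This is an illustration only: a scalar gate per band group stands in for the paper's grouped convolutions and downsampling, and all names are ours, not the authors':

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_group_pool(x, n_groups=2):
    """Toy spectral gating: split channels into band groups, derive a
    per-group gate from global average pooling, and reweight each group
    so that redundant bands are suppressed.
    x: (C, H, W) feature map with C divisible by n_groups."""
    c, h, w = x.shape
    gsize = c // n_groups
    out = np.empty_like(x)
    for g in range(n_groups):
        band = x[g * gsize:(g + 1) * gsize]   # features of one spectral band
        gate = sigmoid(band.mean())           # scalar gate from GAP
        out[g * gsize:(g + 1) * gsize] = band * gate
    return out
```

A band whose pooled activation is large keeps most of its magnitude (gate near 1), while a weakly responding band is attenuated toward 0.5× and below.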

28 pages, 657 KB  
Article
An Uncertainty-Aware Temporal Transformer for Probabilistic Interval Modeling in Wind Power Forecasting
by Shengshun Sun, Meitong Chen, Mafangzhou Mo, Xu Yan, Ziyu Xiong, Yang Hu and Yan Zhan
Sensors 2026, 26(7), 2072; https://doi.org/10.3390/s26072072 - 26 Mar 2026
Abstract
Under high renewable energy penetration, wind power forecasting faces pronounced challenges due to strong randomness and uncertainty, making conventional point-forecast-centric paradigms insufficient for risk-aware and reliable power system scheduling. An uncertainty-aware temporal transformer framework for wind power forecasting is presented, integrating probabilistic modeling with deep temporal representation learning to jointly optimize prediction accuracy and uncertainty characterization. Crucially, rather than treating uncertainty quantification merely as a post-processing step, the central conceptual contribution lies in modularizing uncertainty directly within the attention mechanism. A probability-driven temporal attention mechanism is incorporated at the encoding stage to emphasize high-variability and high-risk time slices during feature aggregation, while a multi-quantile output and interval modeling strategy is adopted at the prediction stage to directly learn the conditional distribution of wind power, enabling simultaneous point and interval forecasts with statistical confidence. Extensive experiments on multiple public wind power datasets demonstrate that the proposed method consistently outperforms traditional statistical models, deep temporal models, and deterministic transformers, as validated by formal statistical significance testing. Specifically, the method achieves an MAE of 0.089, an RMSE of 0.132, and a MAPE of 10.84% on the test set, corresponding to reductions of approximately 8–10% relative to the deterministic transformer. In uncertainty evaluation, a PICP of 0.91 is attained while compressing the MPIW to 0.221 and reducing the CWC to 0.241, indicating a favorable balance between coverage reliability and interval compactness. Compared with mainstream probabilistic forecasting methods, the model further reduces RMSE while maintaining coverage levels close to the 90% target, effectively mitigating excessive interval conservatism. Moreover, by adaptively generating heteroscedastic intervals that widen during high-volatility events and narrow under stable conditions, the model achieves a highly focused and effective capture of critical uncertainty information.
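Two of the quantities in this abstract have simple, standard definitions worth making concrete: the quantile (pinball) loss that multi-quantile training typically minimizes, and the PICP/MPIW interval metrics. This is a generic sketch of those definitions, not code from the paper:

```python
import numpy as np

def pinball_loss(y, q_pred, tau):
    """Quantile (pinball) loss at quantile level tau: penalizes
    under-prediction by tau and over-prediction by (1 - tau)."""
    diff = y - q_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

def interval_metrics(y, lower, upper):
    """PICP: fraction of targets falling inside [lower, upper];
    MPIW: mean width of the prediction interval."""
    covered = (y >= lower) & (y <= upper)
    return covered.mean(), (upper - lower).mean()
```

Minimizing the pinball loss at tau = 0.05 and tau = 0.95, for example, yields a nominal 90% interval whose empirical PICP can then be checked against the target coverage.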
(This article belongs to the Special Issue Artificial Intelligence-Driven Sensing)

31 pages, 16969 KB  
Article
Research on Cooperative Vehicle–Infrastructure Perception Integrating Enhanced Point-Cloud Features and Spatial Attention
by Shiyang Yan, Yanfeng Wu, Zhennan Liu and Chengwei Xie
World Electr. Veh. J. 2026, 17(4), 164; https://doi.org/10.3390/wevj17040164 - 24 Mar 2026
Abstract
Vehicle–infrastructure cooperative perception (VICP) extends the sensing capability of single-vehicle systems by integrating multi-source information from onboard and roadside sensors, thereby alleviating limitations in sensing range and field-of-view coverage. However, in complex urban environments, the robustness of such systems—particularly in terms of blind-spot coverage and feature representation—is severely affected by both static and dynamic occlusions, as well as distance-induced sparsity in point cloud data. To address these challenges, a 3D object detection framework incorporating point cloud feature enhancement and spatially adaptive fusion is proposed. First, to mitigate feature degradation under sparse and occluded conditions, a Redefined Squeeze-and-Excitation Network (R-SENet) attention module is integrated into the feature encoding stage. This module employs a dual-dimensional squeeze-and-excitation mechanism operating across pillars and intra-pillar points, enabling adaptive recalibration of critical geometric features. In addition, a Feature Pyramid Backbone Network (FPB-Net) is designed to improve target representation across varying distances through multi-scale feature extraction and cross-layer aggregation. Second, to address feature heterogeneity and spatial misalignment between heterogeneous sensing agents, a Spatial Adaptive Feature Fusion (SAFF) module is introduced. By explicitly encoding the origin of features and leveraging spatial attention mechanisms, the SAFF module enables dynamic weighting and complementary fusion between fine-grained vehicle-side features and globally informative roadside semantics. Extensive experiments conducted on the DAIR-V2X benchmark and a custom dataset demonstrate that the proposed approach outperforms several state-of-the-art methods. Specifically, Average Precision (AP) scores of 0.762 and 0.694 are achieved at an IoU threshold of 0.5, while AP scores of 0.617 and 0.563 are obtained at an IoU threshold of 0.7 on the two datasets, respectively. Furthermore, the proposed framework maintains real-time inference performance, highlighting its effectiveness and practical potential for real-world deployment.
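The AP@0.5 and AP@0.7 figures rest on the IoU overlap test: a detection counts as a true positive only when its IoU with a ground-truth box exceeds the threshold. A 2-D axis-aligned version of IoU is shown below; the benchmark itself scores 3-D boxes, so this is a deliberate simplification:

```python
def iou(box_a, box_b):
    """Axis-aligned 2-D IoU; boxes are (x1, y1, x2, y2) with x1 < x2, y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    area = lambda b: (b[2] - b[0]) * (b[3] - b[1])
    union = area(box_a) + area(box_b) - inter
    return inter / union if union > 0 else 0.0
```

Raising the threshold from 0.5 to 0.7 demands much tighter localization, which is why the reported AP drops from 0.762/0.694 to 0.617/0.563.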
(This article belongs to the Section Automated and Connected Vehicles)

24 pages, 3621 KB  
Article
Phase-Space Reconstruction and 2-D Fourier Descriptor Features for Appliance Classification in Non-Intrusive Load Monitoring
by Motaz Abu Sbeitan, Hussain Shareef, Madathodika Asna, Rachid Errouissi, Muhamad Zalani Daud, Radhika Guntupalli and Bala Bhaskar Duddeti
Energies 2026, 19(6), 1512; https://doi.org/10.3390/en19061512 - 18 Mar 2026
Abstract
Non-Intrusive Load Monitoring (NILM) enables appliance-level classification from aggregate electrical measurements and supports efficient energy management in smart buildings. However, the accuracy of existing NILM methods is often limited by the inability of conventional feature extraction techniques to capture nonlinear steady-state behavior. This study proposes a novel feature extraction framework for appliance classification, which integrates phase-space reconstruction (PSR) with 2-D Fourier series to derive geometry-based descriptors of appliance current waveforms. Unlike traditional signal-processing methods, the proposed approach utilizes the nonlinear geometric structure revealed by PSR and encodes it through Fourier descriptors, offering a discriminative, low-dimensional feature space suitable for classification using supervised machine learning algorithms. The method is evaluated on the high-resolution controlled single-appliance recordings from the COOLL dataset using the K-Nearest Neighbor (KNN) classifier. Extension to aggregated multi-appliance NILM scenarios would require additional stages such as event detection and load separation. Sensitivity analysis demonstrates that classification performance depends strongly on the choice of time delay and harmonic order, with optimal settings yielding an accuracy of up to 99.52% using KNN. The results confirm that larger time delays and a small number of harmonics effectively capture appliance-specific signatures. The findings highlight the effectiveness of PSR–Fourier-based geometric features as a robust alternative to conventional NILM feature extraction strategies.
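Phase-space reconstruction here means time-delay embedding of the 1-D current waveform: each sample is paired with delayed copies of itself so the waveform traces a closed orbit in phase space. A minimal embedding, with `dim` and `tau` standing for the embedding dimension and time delay that the sensitivity analysis varies (the function name is ours):

```python
import numpy as np

def delay_embed(signal, dim=2, tau=5):
    """Takens-style time-delay embedding: map a 1-D signal to points
    (x[t], x[t + tau], ..., x[t + (dim - 1) * tau]) in phase space.
    Returns an array of shape (N - (dim - 1) * tau, dim)."""
    n = len(signal) - (dim - 1) * tau
    return np.stack([signal[i * tau: i * tau + n] for i in range(dim)], axis=1)
```

For a periodic current waveform, the embedded points form a loop whose shape differs by appliance; 2-D Fourier descriptors of that loop then give the low-dimensional features fed to KNN.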
(This article belongs to the Special Issue Digital Engineering for Future Smart Cities)

15 pages, 5485 KB  
Article
DC Series Arc Fault Detection in Electric Vehicle Charging Systems Using a Temporal Convolution and Sparse Transformer Network
by Kai Yang, Shun Zhang, Rongyuan Lin, Ran Tu, Xuejin Zhou and Rencheng Zhang
Sensors 2026, 26(6), 1897; https://doi.org/10.3390/s26061897 - 17 Mar 2026
Abstract
In electric vehicle (EV) charging systems, DC series arc faults, due to their high concealment and severe hazard, have become one of the important causes of electric vehicle fire accidents. An improved hybrid arc fault model of a charging system was established in Simulink for preliminary study. The results show that the high-frequency noise generated by arc faults affects the output voltage quality of the charger, and this noise is conducted to the battery voltage. Arc faults in a real electric vehicle charging experimental platform were further investigated, where it was found that, during arc fault events, the charging system provides no alarm indication, and the current signals exhibit significant large-amplitude random disturbances and nonlinear fluctuations. Moreover, under normal conditions during vehicle charging startup and the pre-charge stage, the current waveforms also present high-pulse spike characteristics similar to arc faults. Finally, a carefully designed deep neural network-based arc fault detection algorithm, Arc_TCNsformer, is proposed. The current signal samples are directly input into the network model without manual feature selection or extraction, enabling end-to-end fault recognition. By integrating a temporal convolutional network for multi-scale local feature extraction with a sparse Transformer for contextual information aggregation, the proposed method achieves strong robustness under complex charging noise environments. Experimental results demonstrate that the algorithm not only provides high detection accuracy but also maintains reliable real-time performance when deployed on embedded edge computing platforms.
(This article belongs to the Special Issue Deep Learning Based Intelligent Fault Diagnosis)

23 pages, 2679 KB  
Article
Morphology-Aware Deep Features and Frozen Filters for Surgical Instrument Segmentation with LLM-Based Scene Summarization
by Adnan Haider, Muhammad Arsalan and Kyungeun Cho
J. Clin. Med. 2026, 15(6), 2227; https://doi.org/10.3390/jcm15062227 - 15 Mar 2026
Abstract
Background/Objectives: The rise of artificial intelligence is injecting intelligence into the healthcare sector, including surgery. Vision-based intelligent systems that assist surgical procedures can significantly increase productivity, safety, and effectiveness during surgery. Surgical instruments are central components of any surgical intervention, yet detecting and locating them during live surgeries remains challenging due to adverse imaging conditions such as blood occlusion, smoke, blur, glare, low contrast, instrument scale variation, and other artifacts. Methods: To address these challenges, we developed an advanced segmentation architecture termed the frozen-filters-based morphology-aware segmentation network (FFMS-Net). Accurate surgical instrument segmentation strongly depends on edge and morphology information; however, in conventional neural networks, this spatial information is progressively degraded during spatial processing. FFMS-Net introduces a frozen and learnable feature pipeline (FLFP) that simultaneously exploits frozen edge representations and learnable features. Within FLFP, Sobel and Laplacian filters are frozen to preserve edge and orientation information, which is subsequently fused with learnable initial spatial features. Moreover, a tri-atrous blending (TAB) block is employed at the end of the encoder to fuse multi-receptive-field-based contextual information, preserving instrument morphology and improving robustness under challenging conditions such as blur, blood occlusion, and smoke. Datasets focused on surgical instruments often suffer from severe class imbalance and poor instrument visibility. To mitigate these issues, FFMS-Net incorporates a progressively structure-preserving decoder (PSPD) that aggregates dilated and standard spatial information after each upsampling stage to maintain class structure. Multi-scale spatial features from different encoder levels are further fused using light skip paths (LSPs) to project channels with task-relevant patterns. Results/Conclusions: FFMS-Net is extensively evaluated on three challenging datasets: UW-Sinus-surgery-live, UW-Sinus-cadaveric, and CholecSeg8k. The proposed method demonstrates promising performance compared with state-of-the-art approaches while requiring only 1.5 million trainable parameters. In addition, an open-source large language model is integrated for non-clinical summarization of the surgical scene based on the predicted mask and deterministic descriptors derived from it.
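The frozen-filter idea — an edge kernel whose weights are fixed and never trained — can be illustrated with a horizontal Sobel kernel applied via plain 2-D correlation. This is a toy stand-in for the FLFP branch, not the FFMS-Net pipeline itself:

```python
import numpy as np

# Frozen Sobel kernel: responds to horizontal intensity gradients
# (vertical edges), preserving orientation cues regardless of training.
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def conv2d_valid(img, kernel):
    """Plain 'valid'-mode 2-D correlation with a fixed kernel."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out
```

A uniform region yields zero response, while a step edge produces a strong activation; fusing such stable edge maps with learnable features is the mechanism the abstract attributes to FLFP.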
(This article belongs to the Special Issue Artificial Intelligence and Machine Learning in Clinical Practice)

17 pages, 2130 KB  
Article
FogGate-YOLO: Traffic Object Detection in Foggy Environments Using Channel Selection Mechanisms
by Yuhe Yang, Suilian You, Jinpeng Yu and Bo Lu
Sensors 2026, 26(6), 1811; https://doi.org/10.3390/s26061811 - 13 Mar 2026
Abstract
To address the challenges posed by foggy conditions in object detection tasks, we propose FogGate-YOLO, an enhanced YOLOv8 framework designed for robust and efficient detection in foggy environments. Unlike traditional methods that rely on image dehazing or preprocessing enhancements, our approach directly strengthens the model’s feature representation by introducing two novel modules: GroupGatedConv and C2fGated. These modules collaboratively mitigate fog-induced degradation, improving feature extraction and enhancing performance without additional inference overhead. The GroupGatedConv module focuses on coarse-grained channel selection in the early to mid-stages of the backbone, suppressing noise while preserving essential structural features. The C2fGated module refines the aggregated features in both the backbone and neck after multi-branch fusion, enhancing fine-grained feature recalibration. Together, these two modules provide a hierarchical coarse-to-fine channel selection strategy that significantly improves the model’s discriminative power in foggy conditions.
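One way to picture coarse-grained channel selection as described here: score each channel globally, gate it, and zero out the weakest channels so fog-corrupted responses are suppressed. This is an illustrative sketch under our own simplifying assumptions, not the GroupGatedConv module itself:

```python
import numpy as np

def channel_gate(feats, keep_ratio=0.5):
    """Coarse channel selection: score channels by mean activation,
    sigmoid-gate them, and discard the weakest channels entirely.
    feats: (C, H, W) feature map."""
    scores = feats.reshape(feats.shape[0], -1).mean(axis=1)
    gates = 1.0 / (1.0 + np.exp(-scores))
    k = max(1, int(len(gates) * keep_ratio))
    keep = np.argsort(gates)[-k:]          # indices of the strongest channels
    out = np.zeros_like(feats)
    out[keep] = feats[keep] * gates[keep, None, None]
    return out
```

The hard top-k cut models the "coarse" stage; a finer stage (as C2fGated is described) would instead recalibrate all surviving channels with continuous weights.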
(This article belongs to the Topic Advances in Autonomous Vehicles, Automation, and Robotics)

17 pages, 3905 KB  
Article
UAV Multispectral Imagery Combined with Canopy Vertical Layering Information for Leaf Nitrogen Content Inversion in Cotton
by Kaixuan Li, Chunqi Yin, Yangbo Ye, Xueya Han and Sanmin Sun
Agronomy 2026, 16(6), 607; https://doi.org/10.3390/agronomy16060607 - 12 Mar 2026
Abstract
Leaf nitrogen concentration (LNC) exhibits pronounced vertical heterogeneity across canopy layers, which affects the accuracy of nitrogen diagnosis derived from UAV-based remote sensing imagery. To address the differential contributions of leaf nitrogen from distinct canopy strata and the limitations associated with single-source features, this study proposes an integrated framework that combines cumulative LNC indicators across canopy layers with multi-source feature sets (vegetation indices and texture features). Centered on three core technical innovations—(1) incorporating canopy-layer aggregation logic into LNC modeling, (2) integrating spectral and structural information through CNN-based feature fusion, and (3) combining deep feature extraction with gradient boosting regression to improve robustness under multi-stage conditions—the framework systematically evaluates three machine learning algorithms: Random Forest (RF), a Convolutional Neural Network–Extreme Gradient Boosting hybrid model (CNN_XGBoost), and K-Nearest Neighbor (KNN) for cotton LNC estimation across multiple growth stages. The results demonstrate that cumulative canopy-layer nitrogen indicators more effectively represent overall plant nitrogen status than single-layer measurements. The integration of multi-source features further enhances model performance. Under both single-variable inputs and combined VI–TF feature sets, the CNN_XGBoost model consistently outperforms the other models in calibration accuracy and stability across all growth stages. Its optimal performance occurs during the cotton flowering and boll stage, achieving a calibration R2 of 0.921. Overall, the proposed framework substantially improves the estimation accuracy of cotton LNC and provides both a theoretical foundation and technical support for precision nitrogen management and sustainable agricultural development.
(This article belongs to the Section Precision and Digital Agriculture)

25 pages, 6369 KB  
Article
A Lightweight Attention-Guided and Geometry-Aware Framework for Robust Maritime Ship Detection in Complex Electro-Optical Environments
by Zhe Zhang, Chang Lin and Bing Fang
Automation 2026, 7(2), 48; https://doi.org/10.3390/automation7020048 - 12 Mar 2026
Abstract
Reliable ship detection in complex maritime optical imagery is a fundamental requirement for intelligent maritime monitoring and maritime automation systems. However, severe image degradation, large-scale variations, and background clutter often lead to feature ambiguity and unstable detection performance in real-world maritime environments. To address these challenges, this paper proposes a lightweight one-stage ship detection framework designed for robust real-time perception under degraded maritime sensing conditions. The proposed method incorporates an Adaptive Expert Selection Attention (AESA) mechanism to perform adaptive feature selection and background suppression under visually degraded conditions, together with a Geometry-Aware MultiScale Fusion (GAMF) module that enables orientation-aware aggregation of contextual information for elongated ship targets near complex sea–sky boundaries. In addition, a geometry-aware bounding box regression refinement is introduced to improve localization consistency in image space. Extensive experiments conducted on a unified real-world maritime benchmark demonstrate that the proposed framework consistently outperforms the baseline YOLO11n model by approximately 2–5 percentage points in terms of mAP@0.5 and mAP@0.5:0.95, while maintaining moderate computational complexity and real-time inference capability. These results indicate that the proposed method provides a practical and deployment-oriented perception solution for maritime automation applications, including onboard electro-optical sensing and coastal surveillance.

20 pages, 4709 KB  
Article
Low-Contrast Coating Surface Microcrack Detection Using an Improved U-Net Network Based on Probability Map Fusion
by Junwen Xue, Wuzhi Chen, Shida Zhang, Xukun Yang, Keji Pang, Jiaojiao Ren, Lijuan Li and Haiyan Li
Sensors 2026, 26(5), 1629; https://doi.org/10.3390/s26051629 - 5 Mar 2026
Abstract
To address challenges such as low contrast, complex backgrounds, and discontinuous crack distribution in coating surface microcrack detection, a detection method combining circular neighborhood features with an improved U-Net is proposed. In the preprocessing stage, a background template is constructed via median filtering, and crack contrast is enhanced through a combination of difference operations and Gaussian smoothing. Based on the spatial aggregation and directionality of crack pixels, multi-scale and multi-directional circular scanning filters were constructed to generate neighborhood difference maps for quantifying the crack distribution probability. The ImF-Att-DO-U-net was designed by utilizing a dual-channel input consisting of the original image and the crack probability map. The encoder embeds lightweight CBAMs to strengthen crack features, while the decoder introduces DO-Conv and Leaky ReLU to enhance detail capture capabilities. A hybrid loss function combining Binary Cross-Entropy and Dice loss was employed to optimize class imbalance. Algorithm testing results demonstrate that the proposed method achieved a Dice coefficient of 0.884, an SSIM of 0.893, and an accuracy of 0.911, outperforming comparative models such as DO-U-net. The extraction rate for cracks ≥10 μm reached 98%, with a minimum detectable crack size at the 7 μm level. The method exhibited excellent robustness under noise and blur testing, demonstrating superior environmental adaptability.
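The hybrid Binary Cross-Entropy + Dice loss mentioned above has a standard closed form. A small sketch follows; note the equal 0.5/0.5 weighting is our assumption, as the paper's weights are not stated here:

```python
import numpy as np

def bce_dice_loss(pred, target, w_bce=0.5, eps=1e-7):
    """Hybrid segmentation loss: binary cross-entropy plus (1 - Dice).
    pred: predicted probabilities in (0, 1); target: binary mask.
    BCE drives per-pixel calibration; the Dice term counters class
    imbalance by scoring overlap of the (rare) crack pixels directly."""
    p = np.clip(pred, eps, 1 - eps)
    bce = -np.mean(target * np.log(p) + (1 - target) * np.log(1 - p))
    inter = np.sum(pred * target)
    dice = (2 * inter + eps) / (np.sum(pred) + np.sum(target) + eps)
    return w_bce * bce + (1 - w_bce) * (1 - dice)
```

Because thin microcracks occupy a tiny fraction of pixels, plain BCE can be minimized by predicting "background" everywhere; the Dice term makes that degenerate solution expensive.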

25 pages, 2809 KB  
Article
Multi-Architecture Deep Learning for Early Alzheimer’s Detection in MRI: Slice- and Scan-Level Analysis
by Isabelle Bricaud and Giovanni Luca Masala
Int. J. Environ. Res. Public Health 2026, 23(3), 322; https://doi.org/10.3390/ijerph23030322 - 5 Mar 2026
Abstract
Alzheimer’s disease (AD), the most common form of dementia, is a progressive and irreversible neurodegenerative disorder. Structural MRI is widely used for diagnosis, revealing brain changes associated with AD. However, these alterations are often subtle and difficult to detect manually, particularly at early stages. Early intervention during prodromal stages, such as mild cognitive impairment (MCI), can help slow disease progression, highlighting the need for reliable automated methods. In this work, we introduce a dual-level evaluation framework comparing fifteen deep learning architectures, including convolutional neural networks (CNNs), Transformers, and hybrid models, for classifying AD, MCI, and cognitively normal (CN) subjects using the ADNI dataset. A central focus of our work is the impact of robust and standardized preprocessing pipelines, which we identified as a critical yet underexplored factor influencing model reliability. By evaluating performance at both slice-level and scan-level, we reveal that multi-slice aggregation affects architectures asymmetrically. By systematically optimizing preprocessing steps to reduce data variability and enhance feature consistency, we established preprocessing quality as an essential determinant of deep learning performance in neuroimaging. Experimental results show that CNNs and hybrid pre-trained models outperform Transformer-based models in both slice-level and scan-level classification. ConvNeXtV2-L achieved the best scan-level performance (91.07%), EfficientNetV2-L the highest slice-level accuracy (86.84%), and VGG19 balanced results (86.07%/88.52%). ConvNeXtV2-L and SwinV1-L exhibited scan-level improvements of 7.60% and 9.04% respectively, while EfficientNetV2-L experienced degradation of 2.66%, demonstrating that architectural selection and aggregation strategy are interdependent factors. These findings suggest that carefully designed preprocessing not only improves classification accuracy but may also serve as a foundation for more reproducible and interpretable Alzheimer’s disease detection pipelines.
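Scan-level results in this framework come from aggregating per-slice outputs into one decision per scan. A minimal soft-voting aggregator is sketched below; this is one plausible scheme for multi-slice aggregation, and the paper's exact rule may differ:

```python
def scan_level_predict(slice_probs):
    """Aggregate per-slice class-probability vectors into a single
    scan-level label by averaging probabilities, then taking the
    argmax (soft voting)."""
    n_classes = len(slice_probs[0])
    mean = [sum(p[c] for p in slice_probs) / len(slice_probs)
            for c in range(n_classes)]
    return max(range(n_classes), key=lambda c: mean[c])
```

Soft voting lets a few confident slices outweigh many near-uniform ones, which is one reason aggregation can raise scan-level accuracy for some architectures (here by up to 9.04%) while degrading it for others.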

59 pages, 5629 KB  
Article
Adaptive Neural Network Method for Detecting Crimes in the Digital Environment to Ensure Human Rights and Support Forensic Investigations
by Serhii Vladov, Oksana Mulesa, Petro Horvat, Yevhen Kobko, Victoria Vysotska, Vasyl Kikinchuk, Serhii Khursenko, Kostiantyn Karaman and Oksana Kochan
Data 2026, 11(3), 49; https://doi.org/10.3390/data11030049 - 2 Mar 2026
Abstract
This article presents an adaptive neural network method for the automated detection, reconstruction, and prioritisation of multi-stage criminal operations in the digital environment, aiming to protect human rights and ensure the legal security of digital evidence. The developed method combines multimodal temporal encoders, a graph module based on GNN for entity correlation, and a correlation head with a link-prediction mechanism and differentiable path recovery. Sliding time windows, logarithmic transformation of volumetric features, and pseudonymization of identifiers with the ability to utilise privacy-preserving procedures (federated learning, differential privacy) are used for data aggregation and normalisation. Unique features of the developed method include an integrated risk function combining an anomaly component and graph significance, a module for automated forensic packet generation with chain of custody recording, and a mechanism for incremental model updates. Experimental results demonstrate high diagnostic metric values (AUC ≈ 0.97, F1 ≈ 0.99 on the test dataset after balancing), robust recovery of priority paths (“path_probability” > 0.7 for top operations), and pipeline performance in PII leak prioritisation and human trafficking reconstruction scenarios. The study’s contribution lies in a practice-oriented neural network method that integrates detection, correlation, and the collection of legally applicable evidence.
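Two preprocessing steps named in this abstract — pseudonymization of identifiers and logarithmic transformation of heavy-tailed volumetric features — can be sketched with the standard library. The salt value and 12-character truncation are illustrative choices of ours, not parameters from the paper:

```python
import hashlib
import math

def pseudonymize(identifier, salt="demo-salt"):
    """One-way pseudonym for an identifier: salted SHA-256, truncated.
    The same input always maps to the same pseudonym, so correlation
    across records still works without exposing the raw identifier."""
    return hashlib.sha256((salt + identifier).encode()).hexdigest()[:12]

def log_volume(v):
    """Compress heavy-tailed volumetric features (bytes, counts)
    with log1p, which is stable at v = 0."""
    return math.log1p(v)
```

Keeping the salt secret is essential: without it, common identifiers could be recovered by brute-force hashing, defeating the privacy goal.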

18 pages, 1168 KB  
Article
A Hybrid Deep Learning Model for Predicting Tuna Distribution Around Drifting Fish Aggregating Devices
by Bo Song, Jian Liu, Tianjiao Zhang and Quanjin Chen
Sustainability 2026, 18(5), 2406; https://doi.org/10.3390/su18052406 - 2 Mar 2026
Abstract
Accurate prediction of tuna distribution is essential for sustainable fisheries management. This study develops a two-stage hybrid model combining Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), and Random Forest (RF) to predict tuna distribution around drifting fish aggregating devices (DFAD) in the Western and Central Pacific Ocean (WCPO). Echo-sounder buoy data from DFAD were aggregated into 2° × 2° grid cells and matched with oceanographic variables from the Copernicus Marine Service. Random Forest-based variable importance analysis identified primary productivity (27%), chlorophyll-a (22%), and dissolved oxygen (18%) as the three dominant environmental drivers. The CNN-RNN component extracts spatiotemporal features from multi-layer ocean data, while the RF classifier performs binary classification of tuna aggregation zones (high-yield vs. low-yield). All five models (Decision Tree, RF, CNN, Transformer, and CNN-RNN-RF) were evaluated on 557 samples using 5-fold stratified cross-validation, with each fold further split 80:20 for training and validation. The proposed CNN-RNN-RF model achieved the highest performance with an AUC of 0.830, accuracy of 82.6%, and F1-scores of 86.3% (high-yield) and 76.2% (low-yield), outperforming the best baseline model (RF: AUC 0.761, accuracy 75.4%). Predicted high-yield zones showed strong consistency with fishing log records, demonstrating the potential of integrating echo-sounder data with hybrid deep learning for data-driven tuna fisheries management. Full article
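A stylized stand-in (not the paper's model) for the two-stage data flow the abstract describes: a learned CNN-RNN stage extracts spatiotemporal features from multi-layer ocean data, and a Random Forest then classifies the grid cell as high-yield or low-yield. Here fixed mean-pooling replaces the learned feature extractor and a weighted vote replaces the forest; shapes, weights, and the threshold are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_spatiotemporal_features(cube):
    """Stand-in for the CNN-RNN stage: 'cube' has shape
    (time, lat, lon, layers); pool each environmental layer over
    space and time to a single value. The real model learns these
    features; only the data flow is illustrated."""
    return cube.mean(axis=(0, 1, 2))  # one value per layer

def rf_like_classifier(feats, weights, threshold=0.5):
    """Stand-in for the Random Forest stage: a weighted vote over the
    pooled features; 1 means 'high-yield', 0 means 'low-yield'."""
    score = float(np.dot(feats, weights))
    return int(score > threshold)

# Illustrative 2° x 2° grid cell: 8 time steps, a 4x4 spatial patch,
# 3 environmental layers (e.g. primary productivity, chlorophyll-a,
# dissolved oxygen -- the three dominant drivers in the abstract).
cube = rng.random((8, 4, 4, 3))
feats = extract_spatiotemporal_features(cube)
label = rf_like_classifier(feats, weights=np.array([0.27, 0.22, 0.18]))
```

The weight vector reuses the variable-importance percentages from the abstract purely as illustrative numbers; the actual classifier is a trained Random Forest, not a linear vote.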

21 pages, 4260 KB  
Article
CMCLTrack: Reliability-Modulated Cross-Modal Adapter and Cross-Layer Mamba Fusion for RGB-T Tracking
by Pengfei Li, Xiaohe Li and Zide Fan
Electronics 2026, 15(5), 989; https://doi.org/10.3390/electronics15050989 - 27 Feb 2026
Viewed by 262
Abstract
Single-object tracking has progressed rapidly, yet it remains fragile under low illumination, occlusion, and background clutter. RGB-Thermal (RGB-T) tracking improves robustness via modality complementarity, yet many existing trackers do not dynamically switch the dominant modality as sensing quality changes and often rely on simple late fusion at a single stage, underutilizing multi-level features across the backbone. To address these challenges, we propose CMCLTrack, a unified framework that integrates the Reliability-Modulated Cross-Modal Adapter (RMCA) and the Cross-Layer Mamba Fusion (CLMF). Specifically, RMCA performs reliability-aware bidirectional cross-modal interaction by dynamically weighting modality contributions, while CLMF efficiently aggregates complementary cues from multiple encoder layers to exploit multi-level representations. To stabilize the learning of layer-wise modality reliability, we additionally incorporate a cross-layer reliability smoothness regularization. Extensive experiments on multiple RGB-T tracking benchmarks demonstrate that CMCLTrack achieves competitive performance compared to existing state-of-the-art methods. Full article
(This article belongs to the Special Issue Advances in Multitarget Tracking and Applications)
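A one-step sketch (not RMCA itself) of the reliability-modulated fusion idea: per-modality reliability scores are normalised into weights that blend the RGB and thermal features. In the paper the weights are learned and the interaction is bidirectional and layer-wise; here both the scores and the single weighted sum are illustrative assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def reliability_fusion(f_rgb, f_t, r_rgb, r_t):
    """Blend RGB and thermal feature vectors using weights derived
    from per-modality reliability scores (illustrative only)."""
    w_rgb, w_t = softmax([r_rgb, r_t])
    return [w_rgb * a + w_t * b for a, b in zip(f_rgb, f_t)]

# When thermal is judged more reliable (e.g. under low illumination),
# the fused feature leans toward the thermal vector.
fused = reliability_fusion([1.0, 0.0], [0.0, 1.0], r_rgb=0.2, r_t=2.0)
```

With one-hot inputs the fused vector simply exposes the two weights, which makes the dominant-modality switch easy to inspect.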

28 pages, 11762 KB  
Article
A Coarse-to-Fine Optical-SAR Image Registration Algorithm for UAV-Based Multi-Sensor Systems Using Geographic Information Constraints and Cross-Modal Feature Consistency Mapping
by Xiaoyong Sun, Zhen Zuo, Xiaojun Guo, Xuan Li, Peida Zhou, Runze Guo and Shaojing Su
Remote Sens. 2026, 18(5), 683; https://doi.org/10.3390/rs18050683 - 25 Feb 2026
Viewed by 311
Abstract
Optical and synthetic aperture radar (SAR) image registration faces challenges from nonlinear radiometric distortions and geometric deformations caused by different imaging mechanisms. This paper proposes a coarse-to-fine registration algorithm integrating geographic information constraints with cross-modal feature consistency mapping. The coarse stage employs imaging geometry-based coordinate transformation with airborne navigation data to eliminate scale and rotation differences. The fine stage constructs a multi-scale phase congruency-based feature response aggregation model combined with rotation-invariant descriptors and global-to-local search for sub-pixel alignment. Experiments on integrated airborne optical/SAR datasets demonstrate superior performance with an average RMSE of 2.00 pixels, outperforming both traditional handcrafted methods (3MRS, OS-SIFT, POS-GIFT, GLS-MIFT) and state-of-the-art deep learning approaches (SuperGlue, LoFTR, ReDFeat, SAROptNet) while reducing execution time by 37.0% compared with the best-performing baseline. The proposed coarse registration also serves as an effective preprocessing module that improves SuperGlue’s matching rate by 167% and LoFTR’s by 109%, with a hybrid refinement strategy achieving 1.95 pixels RMSE. The method demonstrates robust performance under challenging conditions, enabling real-time UAV-based multi-sensor fusion applications. Full article
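The paper's headline metric is registration RMSE in pixels (average 2.00 pixels). A minimal sketch of how that metric is computed over matched control points; the point pairs below are made up for illustration.

```python
import math

def registration_rmse(pts_ref, pts_warped):
    """Root-mean-square error (pixels) over matched control points:
    pts_ref are reference-image coordinates, pts_warped the same
    points after registration of the other image."""
    sq = [(xr - xw) ** 2 + (yr - yw) ** 2
          for (xr, yr), (xw, yw) in zip(pts_ref, pts_warped)]
    return math.sqrt(sum(sq) / len(sq))

# Illustrative check-point pairs (reference vs. registered SAR coords).
ref = [(10.0, 10.0), (50.0, 20.0), (30.0, 40.0)]
warped = [(11.0, 10.0), (50.0, 21.0), (30.0, 40.0)]
rmse = registration_rmse(ref, warped)
```

Because the error is squared per point, a single badly matched control point dominates the score, which is why coarse pre-alignment (as in the proposed method) helps the fine stage reach sub-pixel or near-pixel RMSE.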
