Search Results (556)

Search Parameters:
Keywords = multi-scale deep feature representation

29 pages, 5347 KB  
Article
Optimized Reinforcement Learning-Driven Model for Remote Sensing Change Detection
by Yan Zhao, Zhiyun Xiao, Tengfei Bao and Yulong Zhou
J. Imaging 2026, 12(3), 139; https://doi.org/10.3390/jimaging12030139 - 19 Mar 2026
Abstract
In recent years, deep learning has driven remarkable progress in remote sensing change detection (CD); however, practical deployment is still hindered by two limitations. First, CD results are easily degraded by imaging-induced uncertainties—mixed pixels and blurred boundaries, radiometric inconsistencies (e.g., shadows and seasonal illumination changes), and slight residual misregistration—leading to pseudo-changes and fragmented boundaries. Second, prevailing methods follow a static one-pass inference paradigm and lack an explicit feedback mechanism for adaptive error correction, which weakens generalization in complex or unseen scenes. To address these issues, we propose a feedback-driven CD framework that integrates a dual-branch U-Net with deep reinforcement learning (RL) for pixel-level probabilistic iterative refinement of an initial change probability map. The backbone produces a preliminary posterior estimate of change likelihood from multi-scale bi-temporal features, while a PPO-based RL agent formulates refinement as a Markov decision process. The agent leverages a state representation that fuses multi-scale features, prediction confidence/uncertainty, and spatial consistency cues (e.g., neighborhood coherence and edge responses) to apply multi-step corrective actions. From an imaging and interpretation perspective, the RL module can be viewed as a learnable, self-adaptive imaging optimization mechanism: for high-risk regions affected by blurred boundaries, radiometric inconsistencies, and local misalignment, the agent performs feedback-driven multi-step corrections to improve boundary fidelity and spatial coherence while suppressing pseudo-changes caused by shadows and illumination variations. Experiments on four datasets (CDD, SYSU-CD, PVCD, and BRIGHT) verify consistent improvements. 
Using SiamU-Net as an example, the proposed RL refinement increases mIoU by 3.07, 2.54, 6.13, and 3.1 points on CDD, SYSU-CD, PVCD, and BRIGHT, respectively, with similarly consistent gains observed when the same RL module is integrated into other representative CD backbones. Full article
(This article belongs to the Section AI in Imaging)
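The abstract frames refinement as a Markov decision process over pixels. The paper's PPO agent cannot be reproduced from the abstract alone, so the sketch below substitutes a fixed neighbourhood-coherence heuristic for the learned policy: per-pixel actions in {-delta, 0, +delta} nudge low-confidence pixels toward their local majority over several steps. The function name, thresholds, and the heuristic itself are illustrative assumptions, not the paper's method.

```python
import numpy as np

def refine_change_map(prob, steps=3, delta=0.1, tau=0.5):
    """Iteratively nudge low-confidence pixels of a change-probability
    map toward the majority vote of their 3x3 neighbourhood.

    Stand-in for a learned PPO policy: the per-pixel 'action' is
    {-delta, 0, +delta}, chosen here by a fixed neighbourhood-coherence
    heuristic rather than by an RL agent (illustrative assumption).
    """
    p = prob.astype(float).copy()
    for _ in range(steps):
        # mean of the 3x3 neighbourhood (edges replicate border pixels)
        padded = np.pad(p, 1, mode="edge")
        neigh = np.zeros_like(p)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                neigh += padded[1 + dy : 1 + dy + p.shape[0],
                                1 + dx : 1 + dx + p.shape[1]]
        neigh /= 9.0
        confidence = np.abs(p - tau)           # low near the decision boundary
        uncertain = confidence < 0.25          # act only on high-risk pixels
        action = np.sign(neigh - tau) * delta  # move toward neighbourhood vote
        p = np.where(uncertain, np.clip(p + action, 0.0, 1.0), p)
    return p
```

A noisy pixel inside a confidently "changed" region drifts upward over the steps, while confident pixels stay fixed.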

24 pages, 26483 KB  
Article
E-STNet: A Non-Ideal Array DOA Estimation Method Based on Enhanced Spatio-Temporal Features
by Haiqin Zhao, Jian Gong and Changlin Zhou
Electronics 2026, 15(6), 1270; https://doi.org/10.3390/electronics15061270 - 18 Mar 2026
Abstract
To address the challenge of degraded DOA estimation performance under array errors and low signal-to-noise ratio conditions, this paper proposes an Enhanced Spatio-Temporal Network (E-STNet). This network adopts a dual-source input architecture. By integrating multi-scale pooling and a hybrid Long Short-Term Memory-Transformer (LSTM-Transformer) encoder, the network jointly refines spatial feature representations and captures multi-granularity temporal dependencies. Simulation results demonstrate that, under challenging scenarios such as array errors, low Signal-to-Noise Ratio (SNR), and closely spaced sources, E-STNet achieves higher estimation accuracy and stronger robustness than conventional algorithms and existing deep learning methods, providing an effective solution for DOA estimation in complex environments. Full article
(This article belongs to the Special Issue Advances in Radar Signal Processing Technology and Its Application)
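As a rough illustration of the multi-scale pooling the abstract mentions, the sketch below average-pools a 1-D feature sequence at several window sizes and concatenates the results, giving coarse-to-fine summaries; the 1-D input and the pooling scales are assumptions, not E-STNet's actual configuration.

```python
import numpy as np

def multi_scale_pool(x, scales=(1, 2, 4)):
    """Average-pool a 1-D feature sequence at several window sizes and
    concatenate the results (coarse-to-fine summaries).

    Illustrative sketch only; scales and input layout are assumptions.
    """
    feats = []
    for s in scales:
        n = len(x) // s * s                     # trim to a multiple of s
        pooled = x[:n].reshape(-1, s).mean(axis=1)
        feats.append(pooled)
    return np.concatenate(feats)
```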

26 pages, 3627 KB  
Article
Multi-Radio Access Fusion with Contrastive Graph Message Passing Neural Networks for Intelligent Maritime Routing
by Xuan Zhou, Jin Chen and Haitao Lin
Electronics 2026, 15(6), 1268; https://doi.org/10.3390/electronics15061268 - 18 Mar 2026
Abstract
Maritime heterogeneous wireless networks are characterized by dynamic topology and significant heterogeneity in bandwidth, latency, and coverage across communication paradigms, rendering traditional terrestrial routing protocols inadequate. To address these challenges, this paper proposes a unified multi-radio access fusion infrastructure featuring a gateway that enables protocol conversion and collaborative resource management across heterogeneous systems. Building upon this infrastructure, we introduce CMPGNN-DQN, an intelligent routing algorithm that integrates Contrastive Message Passing Graph Neural Networks with Deep Reinforcement Learning. Specifically, the algorithm employs k-hop neighbor aggregation to expand the receptive field for routing decisions, and utilizes a dual-view contrastive learning mechanism—encompassing both homogeneous and heterogeneous perspectives—to enhance representation robustness against dynamic topology perturbations. By deeply fusing network topology features with real-time state information, including bandwidth, delay, and queue length, the agent makes hop-by-hop routing decisions via an ε-greedy policy within the DQN framework. Extensive simulations conducted across various scales of dynamic maritime communication scenarios demonstrate that CMPGNN-DQN outperforms state-of-the-art benchmark algorithms, including AODV, DQN, and GCN, across key metrics such as packet delivery ratio, transmission latency, and bandwidth utilization. Quantitatively, compared to the best-performing alternative (MPNN-DQN), our algorithm achieves throughput improvements of 2.06–5.04% under standard traffic loads and 6.6–27.1% under partial link failure conditions, while converging within merely 25 training episodes. Notably, under heavy network loads (40% load rate) or partial link failures, the algorithm maintains stable communication performance, demonstrating strong adaptability to complex dynamic environments. Full article
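The hop-by-hop ε-greedy decision rule the abstract describes is standard DQN machinery and can be sketched on its own. The sketch below assumes a per-node table of neighbour Q-values and omits the graph neural network that produces those values in CMPGNN-DQN.

```python
import random

def epsilon_greedy_next_hop(q_values, epsilon=0.1, rng=random):
    """Pick the next hop for a packet: with probability epsilon explore
    a random neighbour, otherwise exploit the neighbour with the highest
    estimated Q-value. `q_values` maps neighbour id -> estimated return.

    Generic DQN action rule; the full CMPGNN-DQN agent computes these
    Q-values from graph features, which is not modelled here.
    """
    neighbours = list(q_values)
    if rng.random() < epsilon:
        return rng.choice(neighbours)           # explore
    return max(neighbours, key=q_values.get)    # exploit
```

With epsilon = 0 the rule is purely greedy; during training epsilon is typically annealed toward a small floor.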

24 pages, 2611 KB  
Article
MF-DFA–Enhanced Deep Learning for Robust Sleep Disorder Classification from EEG Signals
by Abdulaziz Alorf
Fractal Fract. 2026, 10(3), 199; https://doi.org/10.3390/fractalfract10030199 - 18 Mar 2026
Abstract
Sleep disorders are widespread and can lead to severe health problems such as cardiovascular disease and cognitive impairment. Conventional polysomnography-based diagnosis relies on manual EEG analysis by trained specialists, which is time-consuming and subject to inter-rater variability. Although deep learning (DL) models have shown promising results on EEG-based sleep classification, they often fail to capture the multiscale temporal dynamics that characterize physiological signals. In this work, a hybrid model combining a CNN with multifractal detrended fluctuation analysis (MF-DFA) is proposed to capture both localized temporal features and long-range fractal dynamics of single-channel EEG recordings. The model was evaluated on two separate polysomnographic datasets: the CAP Sleep Dataset for five-class sleep disorder classification (Healthy, Insomnia, Narcolepsy, PLM, and RBD) and the ISRUC Sleep Dataset for three-class subject-independent validation. On the CAP dataset, the framework achieved an accuracy of 86.38%. Cross-dataset transfer to the ISRUC Sleep Dataset, where only the classification head was fine-tuned on a small labeled subset while all feature-extraction layers remained frozen from CAP training, achieved 87.50% accuracy, demonstrating that the learned representations generalize across differing recording protocols, sampling rates, and diagnostic label spaces. Ablation experiments confirmed the importance of the MF-DFA features; removing them led to markedly lower classification rates. These findings demonstrate the clinical feasibility of combining fractal analysis with DL for automated, generalizable sleep disorder detection, suitable for large-scale monitoring and resource-limited clinical environments. Full article
(This article belongs to the Special Issue Fractals in Physiology and Medicine)
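The core MF-DFA computation can be illustrated compactly. The sketch below performs one step: integrate the mean-removed signal, detrend fixed-length windows linearly, and form the q-th order fluctuation F_q(s). Full MF-DFA sweeps many scales s and moments q and fits the generalized Hurst exponent h(q) from log F_q(s) versus log s; the detrending order and parameter choices here are illustrative.

```python
import numpy as np

def dfa_fluctuation(x, scale, q=2.0):
    """One step of MF-DFA: integrate the (mean-removed) signal, split it
    into non-overlapping windows of length `scale`, remove a linear
    trend from each window, and return the q-th order fluctuation F_q(s).

    Minimal sketch (linear detrending, one scale, one q); full MF-DFA
    repeats this over many scales and moments.
    """
    profile = np.cumsum(x - np.mean(x))          # integrated profile
    n_win = len(profile) // scale
    t = np.arange(scale)
    f2 = []                                      # per-window detrended variance
    for w in range(n_win):
        seg = profile[w * scale : (w + 1) * scale]
        coef = np.polyfit(t, seg, 1)             # local linear trend
        resid = seg - np.polyval(coef, t)
        f2.append(np.mean(resid ** 2))
    f2 = np.asarray(f2)
    return (np.mean(f2 ** (q / 2.0))) ** (1.0 / q)
```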

23 pages, 12740 KB  
Article
SAM2-RoadNet: Topology-Aware Multi-Scale Road Extraction from High-Resolution Remote Sensing Images
by Ruyue Feng, Ziyou Guo, Xiao Du and Tieru Wu
Remote Sens. 2026, 18(6), 913; https://doi.org/10.3390/rs18060913 - 17 Mar 2026
Abstract
Road extraction from high-resolution remote sensing images (HRSIs) is a fundamental task for many geospatial applications, yet it remains challenging due to complex backgrounds, frequent occlusions, and the requirement to preserve the topological connectivity of elongated road networks. To address these issues, this paper proposes SAM2-RoadNet, a topology-aware multi-scale road extraction framework that adapts the powerful representation capability of the Segment Anything Model 2 (SAM2) to HRSI road segmentation. Unlike prompt-driven segmentation paradigms, SAM2-RoadNet employs the SAM2 image encoder solely as a feature extractor and introduces an adapter-based domain adaptation strategy to efficiently transfer pretrained knowledge to the remote sensing domain. Receptive field blocks are further integrated to enhance contextual perception and align channel dimensions, followed by a weighted bidirectional feature pyramid network (W-BiFPN) to fuse hierarchical features across multiple scales. Moreover, a topology-aware training strategy based on the soft-clDice loss is incorporated to explicitly enforce structural continuity and reduce road fragmentation. Extensive experiments on two challenging benchmarks, DeepGlobe and Massachusetts, demonstrate that SAM2-RoadNet achieves superior overall performance across multiple evaluation metrics compared with state-of-the-art methods in both quantitative accuracy and qualitative visual quality, while showing promising cross-dataset transferability without additional fine-tuning. Full article
(This article belongs to the Section Remote Sensing Image Processing)
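The weighted fusion in a W-BiFPN typically follows BiFPN's "fast normalized fusion": same-resolution feature maps are combined with ReLU-clamped weights normalized to sum to one. A minimal sketch, assuming pre-resized inputs and hand-set weights (in the network the weights are learned per fusion node):

```python
import numpy as np

def wbifpn_fuse(feats, weights, eps=1e-4):
    """Fast normalized fusion (BiFPN-style): combine same-shape feature
    maps with non-negative weights normalized to sum to ~1,
    w_i' = relu(w_i) / (sum_j relu(w_j) + eps).

    Illustrative: in W-BiFPN the weights are trained and fusion happens
    after resizing pyramid levels to a common resolution.
    """
    w = np.maximum(np.asarray(weights, dtype=float), 0.0)  # ReLU clamp
    w = w / (w.sum() + eps)                                # normalize
    return sum(wi * f for wi, f in zip(w, feats))
```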

31 pages, 23615 KB  
Article
A Memory-Efficient Class-Incremental Learning Framework for Remote Sensing Scene Classification via Feature Replay
by Yunze Wei, Yuhan Liu, Ben Niu, Xiantai Xiang, Jingdun Lin, Yuxin Hu and Yirong Wu
Remote Sens. 2026, 18(6), 896; https://doi.org/10.3390/rs18060896 - 15 Mar 2026
Abstract
Most existing deep learning models for remote sensing scene classification (RSSC) adopt an offline learning paradigm, where all classes are jointly optimized on fixed-class datasets. In dynamic real-world scenarios with streaming data and emerging classes, such paradigms are inherently prone to catastrophic forgetting when models are incrementally trained on new data. Recently, a growing number of class-incremental learning (CIL) methods have been proposed to tackle these issues, some of which achieve promising performance by rehearsing training data from previous tasks. However, implementing such a strategy in real-world scenarios is often challenging, as the requirement to store historical data frequently conflicts with strict memory constraints and data privacy protocols. To address these challenges, we propose a novel memory-efficient feature-replay CIL framework (FR-CIL) for RSSC that retains compact feature embeddings, rather than raw images, as exemplars for previously learned classes. Specifically, a progressive multi-scale feature enhancement (PMFE) module is proposed to alleviate representation ambiguity. It adopts a progressive construction scheme to enable fine-grained and interactive feature enhancement, thereby improving the model’s representation capability for remote sensing scenes. Then, a specialized feature calibration network (FCN) is trained in a transductive learning paradigm with manifold consistency regularization to adapt stored feature descriptors to the updated feature space, thereby effectively compensating for feature space drift and enabling a unified classifier. Following feature calibration, a bias rectification (BR) strategy is employed to mitigate prediction bias by exclusively optimizing the classifier on a balanced exemplar set. As a result, this memory-efficient CIL framework not only addresses data privacy concerns but also mitigates representation drift and classifier bias.
Extensive experiments on public datasets demonstrate the effectiveness and robustness of the proposed method. Notably, FR-CIL outperforms the leading state-of-the-art CIL methods in mean accuracy by margins of 3.75%, 3.09%, and 2.82% on the six-task AID, seven-task RSI-CB256, and nine-task NWPU-45 datasets, respectively. At the same time, it reduces memory storage requirements by over 94.7%, highlighting its strong potential for real-world RSSC applications under strict memory constraints. Full article

18 pages, 5377 KB  
Article
Prediction of Prestress Changes in Concrete Under Freeze–Thaw Cycles Based on Transformer Model
by Jiancheng Zhang, Xiaolin Yang and Wen Zhang
Eng 2026, 7(3), 133; https://doi.org/10.3390/eng7030133 - 14 Mar 2026
Abstract
Given that freeze–thaw damage of prestressed concrete significantly threatens structural service life and that existing conventional simulation techniques fail to capture prestress time series, this paper proposes a deep learning prediction model based on the Transformer architecture. The model integrates a multi-head self-attention mechanism and positional encoding to effectively capture long-range dependencies in prestress time series. It enhances temporal modeling capability through a 128-dimensional feature space (chosen to balance representation capacity and computational efficiency for the dataset scale) and a stack of four encoder layers. A dataset was constructed using time-series data from three prestressed concrete components subjected to 50 freeze–thaw cycles. The F-a component was used as the training set, while F-b and F-c served as the testing sets. During the training phase, a Noam learning rate scheduler, gradient clipping, and an early stopping strategy were employed. The results indicate that the training strategy enables the loss function to converge quickly without overfitting, demonstrating good generalization performance. The prediction model performs well on the F-b and F-c test sets, with determination coefficients (R2) of 0.8404 and 0.8425 and corresponding Mean Absolute Errors (MAE) of 61.71 MPa and 57.41 MPa, respectively. It accurately tracks the periodic variation trend of prestress, demonstrating its effectiveness in prestress prediction. This model provides a new technical tool for the health monitoring and performance prediction of prestressed concrete structures in freeze–thaw environments. Full article
(This article belongs to the Section Chemical, Civil and Environmental Engineering)
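The positional encoding the abstract mentions can be sketched with the standard sinusoidal form from the original Transformer; whether this paper uses that exact variant is an assumption, and d_model=128 simply mirrors the 128-dimensional feature space described above.

```python
import numpy as np

def positional_encoding(seq_len, d_model=128):
    """Standard sinusoidal positional encoding (Vaswani et al.):
    PE[pos, 2i]   = sin(pos / 10000**(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000**(2i/d_model))

    d_model=128 mirrors the paper's feature dimension; the paper may
    use a different (e.g. learned) encoding.
    """
    pos = np.arange(seq_len)[:, None]            # (seq_len, 1)
    i = np.arange(0, d_model, 2)[None, :]        # even dimension indices
    angle = pos / (10000.0 ** (i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angle)
    pe[:, 1::2] = np.cos(angle)
    return pe
```

The encoding is added element-wise to the input embeddings before the encoder stack.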

21 pages, 11196 KB  
Article
CR-MAT: Causal Representation Learning for Few-Shot Non-Intrusive Load Monitoring
by Xianglong Li, Shengxin Kong, Jiani Zeng, Hanqi Dai, Lu Zhang, Weixian Wang, Zihan Zhang and Liwen Xu
Electronics 2026, 15(6), 1195; https://doi.org/10.3390/electronics15061195 - 13 Mar 2026
Abstract
Non-intrusive load monitoring (NILM) is a key enabler for smart-grid applications, yet practical deployment is often hindered by limited appliance-level labels and severe distribution shifts across households and operating conditions. As a result, many deep learning approaches become unreliable in small-sample and out-of-distribution (OOD) settings. In this paper, we propose CR-MAT, a causality-driven representation learning framework for few-shot NILM classification. Instead of relying on large-scale training or heavy data augmentation, CR-MAT injects causal representation learning into multi-appliance task modeling, encouraging the network to learn appliance-discriminative features that are stable across environments while suppressing spurious, domain-specific correlations. We conduct extensive experiments under multiple OOD scenarios and consistently observe improved classification robustness compared with deep NILM baselines. Further analysis indicates that causal representation learning enhances resilience to non-stationary consumption patterns and improves generalization under OOD scenarios. The proposed framework provides a practical route toward reliable NILM classification and supports downstream smart-grid applications such as flexible load control and demand response. Full article

31 pages, 6867 KB  
Article
Field-Scale Detection of Rice Bacterial Leaf Blight Using UAV-Based Multispectral Imagery: Via Cross-Scale Sample-Label Transfer and Spatial–Spectral Feature Fusion
by Huiqin Ma, Zhiqin Gui, Yujin Jing, Dongmei Chen, Dayang Li, Dong Shen and Jingcheng Zhang
Remote Sens. 2026, 18(6), 880; https://doi.org/10.3390/rs18060880 - 13 Mar 2026
Abstract
Accurate field-scale crop disease detection is crucial for precise decisions and for highly efficient multi-scale collaboration. UAV-based multispectral imaging technology offers advantages in terms of high efficiency and low cost, and deep learning shows potential for deep representation and fusion of spectral and spatial features. However, traditional manual disease surveys are limited by efficiency and cost, making it difficult to supply the large sample sizes required by deep learning. Therefore, we propose a method for rice bacterial leaf blight detection using UAV-based multispectral imagery that integrates cross-scale sample-label transfer with a spectral–spatial dual-branch feature fusion architecture (DualRiceNet). We first used RTK positioning to transfer disease labels from near-ground RGB images to high-altitude multispectral images, effectively expanding the dataset and alleviating the scarcity of labeled samples. DualRiceNet employs a cross-attention mechanism to couple its spectral and spatial branches, thereby isolating disease-specific spatial–spectral patterns from complex farmland-background interference. DualRiceNet achieved an overall accuracy (OA) of 92.3% on the same-distribution test set. On an independent scenario test set spanning differences in geography, time, phenology, and variety, the model maintained the highest OA of 80.0%. Our method demonstrated excellent generalization to real-world environmental variations in rice fields. Full article

21 pages, 4501 KB  
Article
YOLOv8n-ALC: An Efficient Network for Bolt-Nut Fastener Detection in Complex Substation Environments
by Dazhang You, Fangke Li, Sicheng Wang and Yepeng Zhang
Appl. Sci. 2026, 16(6), 2716; https://doi.org/10.3390/app16062716 - 12 Mar 2026
Abstract
Bolt-nut fasteners are critical components of substation equipment, and their integrity directly affects the operational reliability of power systems. In practical inspection scenarios, however, the small physical scale of bolt-nut fasteners, together with complex background structures, often obscures their discriminative visual features, making accurate automated detection particularly challenging. Reliable detection is a prerequisite for downstream tasks such as loosening identification and defect diagnosis. To address these challenges, this paper proposes YOLOv8n-ALC, an enhanced detection network built upon the lightweight YOLOv8n framework. The backbone is redesigned by integrating the AdditiveBlock from CAS-ViT and a Convolutional Gated Linear Unit (CGLU) to strengthen fine-grained feature extraction and suppress background interference without increasing computational burden. In addition, an improved Large Separable Kernel Attention (LSKA) module is introduced to expand the effective receptive field while maintaining efficiency, enabling more robust multi-scale feature representation. To further alleviate feature degradation of small bolt-nut fasteners in deep layers, a Context-Guided Reconstruction Feature Pyramid Network (CGRFPN) is employed in the neck to optimize cross-layer feature fusion and enhance localization accuracy. Experimental results demonstrate that YOLOv8n-ALC achieves an mAP@0.5 of 92.1%, with precision and recall of 93.5% and 87.1%, respectively, outperforming the baseline by clear margins. These results confirm the effectiveness and robustness of the proposed method for intelligent substation inspection and bolt-nut fastener condition monitoring. Full article
(This article belongs to the Section Computing and Artificial Intelligence)

16 pages, 1192 KB  
Article
Multi-Scale Feature Mixing of Language Model Embeddings for Enhanced Prediction of Submitochondrial Protein Localization
by Rong Wang, Menghua Wang, Yibo Wu, Lixiang Yang and Xiao Wang
Algorithms 2026, 19(3), 212; https://doi.org/10.3390/a19030212 - 11 Mar 2026
Abstract
Accurate prediction of submitochondrial localization is fundamental to understanding mitochondrial biogenesis and cellular metabolic pathways. While deep representations from pre-trained protein language models (pLMs) have significantly advanced the field, traditional global average pooling methods often fail to capture critical, localized N-terminal targeting signals, particularly in long sequences where these motifs are mathematically diluted. To resolve this “signal dilution” bottleneck, we developed a multi-scale architecture that explicitly integrates high-resolution N-terminal features with global evolutionary context derived from ESM-2 embeddings. The proposed framework utilizes an orthogonal mixing strategy consisting of Token-mixing and Channel-mixing. Token-mixing is specifically designed to detect spatial rhythmic patterns across residue positions, while Channel-mixing refines the biochemical signatures within the latent feature space. Extensive benchmarking across diverse datasets demonstrates that our approach effectively maintains signal integrity. Compared to existing state-of-the-art methods, the model achieves a superior overall Generalized Correlation Coefficient (GCC) of 0.7443 on the SM424-18 dataset and 0.7878 on the SubMitoPred dataset, outperforming the latest benchmarks by 9.4% and 16.1%, respectively. Furthermore, on the independent M983 test set, our method maintained a high GCC of 0.6945, demonstrating a 9.9% improvement relative to the state-of-the-art methods. This robust and efficient framework provides a high-precision tool for large-scale mitochondrial proteomics. Full article
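The "orthogonal mixing" of Token-mixing and Channel-mixing described above follows the MLP-Mixer pattern: one linear map acts across residue positions per channel, the other across channels per position. A linear-only sketch (no MLPs, nonlinearities, or residual connections; weight shapes are assumptions about the paper's mixer blocks):

```python
import numpy as np

def mix(tokens, w_token, w_channel):
    """MLP-Mixer-style orthogonal mixing: token-mixing applies a weight
    matrix across the residue/position axis (per channel), channel-mixing
    across the feature axis (per position).

    Linear sketch only; the paper's blocks presumably add MLPs,
    nonlinearities, and normalization.
    """
    x = np.asarray(tokens, dtype=float)  # shape (positions, channels)
    x = w_token @ x                      # mix across positions
    x = x @ w_channel                    # mix across channels
    return x
```

With identity weight matrices the input passes through unchanged, which makes the role of each axis easy to verify.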

19 pages, 7917 KB  
Article
A Line Selection Method for Small-Current Grounding Faults Based on Time–Frequency Graphs and Image Detection
by Lei Li, Shuai Hao and Weili Wu
Electronics 2026, 15(6), 1165; https://doi.org/10.3390/electronics15061165 - 11 Mar 2026
Abstract
Traditional deep learning-based line selection algorithms suffer from insufficient multi-scale feature interaction, which degrades line selection accuracy. To address this, a multi-scale feature fusion line selection method based on transfer learning, abbreviated TLM-Net, is proposed. First, to improve the generalization ability of the line selection network in small-sample scenarios, a simulation-data pre-training framework is constructed, and a robust feature representation basis is established through a cross-domain knowledge transfer mechanism. Second, to overcome the insufficient feature extraction of traditional algorithms, a multi-scale feature fusion network (MFFN) is designed to integrate global context and local detail features, achieving cross-level semantic complementarity and spatial alignment optimization. Third, to strengthen the representation of weak fault features, an EKA mechanism integrating variable-kernel convolution is designed; it reduces background interference through adaptive multi-region feature focusing and improves the model’s edge recognition accuracy for irregular targets. Finally, the pre-trained model is transferred to the target domain using a transfer learning strategy, and the network parameters are fine-tuned on field data to achieve cross-domain adaptation of the feature space. Experimental results show that TLM-Net reaches an mAP@0.5 of 98.5%, with precision and recall of 98.3% and 96.5%, respectively, and improves accuracy by 37.5% over the original model. Full article
(This article belongs to the Special Issue Security Defense Technologies for the New-Type Power System)

19 pages, 2380 KB  
Article
DTBAffinity: A Multi-Modal Feature Engineering and Gradient-Boosting Framework for Drug–Target Binding Affinity on Davis and KIBA Benchmarks
by Meshari Alazmi
Computers 2026, 15(3), 182; https://doi.org/10.3390/computers15030182 - 10 Mar 2026
Abstract
Accurately predicting how strongly a drug binds to its target is central to drug discovery: it helps select the most promising compounds and reduces the number of costly experiments. We present DTBAffinity, a multi-modal regression framework that integrates chemically meaningful ligand descriptors with diverse protein sequence features in a unified gradient-boosting model. The representation of ligands includes physicochemical and topological descriptors (RDKit and Mordred), structural keys (MACCS and FP4), circular fingerprints (ECFP/Morgan), and SMILES-derived features from iFeatureOmega. For proteins, thousands of sequence-derived descriptors (composition, autocorrelations, physicochemical profiles, and evolutionary indices) from iFeatureOmega are used, together with contextual embeddings from large protein language models (ESM-1b, ESM-2). The feature matrices are cleaned, variance-filtered, z-score scaled, and reduced by univariate selection before being concatenated and modeled with regularized XGBoost ensembles. We evaluate DTBAffinity on two commonly used kinase-centric datasets: Davis (30,056 interactions with pKd values) and KIBA (118,254 interactions with integrated affinity scores). Performance is measured with MSE, R2, Pearson/Spearman correlations, Concordance Index (CI), rm2, and AUPR. On Davis, DTBAffinity yields MSE = 0.1885, CI = 0.9102, and AUPR = 0.8112, and on KIBA it gives MSE = 0.1540, CI = 0.8686, and AUPR = 0.8361, outperforming state-of-the-art baselines such as KronRLS, SimBoost, DeepDTA, and GraphDTA. These findings indicate that combining interpretable descriptors with contextual embeddings in a robust boosting framework yields accurate, interpretable, and generalizable DTBA prediction. Full article
(This article belongs to the Special Issue AI in Bioinformatics)

22 pages, 11365 KB  
Article
Addressing Dense Small-Object Detection in Remote Sensing: An Open-Vocabulary Object Detection Framework
by Menghan Ju, Yingchao Feng, Wenhui Diao and Chunbo Liu
Remote Sens. 2026, 18(6), 851; https://doi.org/10.3390/rs18060851 - 10 Mar 2026
Abstract
Remote sensing open-vocabulary object detection aims to identify and localize unseen categories in remote sensing imagery. However, constrained by the dense target distribution, complex background interference, and drastic scale variations inherent to remote sensing scenes, existing methods are prone to background noise when extracting features from dense small-target regions, which weakens semantic representation and reduces localization accuracy. We therefore propose RS-DINO to address these challenges. Firstly, to counter small-target features being obscured by the background, the feature extraction module incorporates a multi-scale large-kernel attention mechanism, which expands the receptive field while enhancing local detail modelling and significantly improves the representation of minute targets. Secondly, a cross-modal feature fusion module based on bidirectional cross-attention achieves deep alignment between image and textual features, after which a language-guided query selection mechanism improves detection accuracy through hybrid query strategies. Finally, to enhance the spatial sensitivity and channel adaptability of the fused features, the multimodal decoder integrates a convolutional gated feedforward network, significantly boosting robustness in dense, multi-scale scenes. Experiments on DIOR, DOTA v2.0, and NWPU-VHR10 demonstrate substantial gains, with fine-tuned RS-DINO surpassing existing methods by 3.5%, 3.7%, and 4.0% in accuracy, respectively. Full article
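The bidirectional cross-attention idea, where each modality queries the other and the result is residual-added, can be illustrated with a minimal single-head NumPy sketch; learned projection matrices, multi-head structure, and normalization layers are omitted, and the function names are hypothetical rather than RS-DINO's actual implementation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(q_feats, kv_feats):
    """Single-head cross-attention: queries from one modality, keys/values from the other."""
    d = q_feats.shape[-1]
    attn = softmax(q_feats @ kv_feats.T / np.sqrt(d), axis=-1)  # (n_q, n_kv)
    return attn @ kv_feats                                      # (n_q, d)

def bidirectional_fusion(img_feats, txt_feats):
    """Each modality attends to the other; outputs are residual-added to the inputs."""
    img_out = img_feats + cross_attend(img_feats, txt_feats)  # image queries text
    txt_out = txt_feats + cross_attend(txt_feats, img_feats)  # text queries image
    return img_out, txt_out
```

The residual connection preserves each modality's own features while mixing in aligned information from the other, which is what allows the subsequent language-guided query selection to operate on text-aware image tokens.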

21 pages, 6660 KB  
Article
Infrared and Visible Multi-Scale Pyramid Cross-Layer Fusion Algorithm Based on Thermal Extended Target Separation
by An Liang, Laixian Zhang, Yingchun Li, Hao Ding, Haijing Zheng, Rong Li and Rui Zhu
Photonics 2026, 13(3), 263; https://doi.org/10.3390/photonics13030263 - 10 Mar 2026
Abstract
Infrared and visible image fusion aims to synergistically combine the thermal target saliency of infrared images with the rich textural details of visible images. To address the limitations of traditional multi-scale methods in target-background contrast and detail preservation, this paper introduces a novel multi-scale pyramid cross-layer fusion framework whose core is a thermal expansion-based target separation mechanism for superior hierarchical decomposition. Source images are first decomposed via a Gaussian–Laplacian pyramid for multi-resolution representation. By exploiting infrared thermal saliency and visible geometric priors, the scene is explicitly segregated into a target layer and a background layer. The target layer employs deep feature extraction based on Iteratively Reweighted Nuclear Norm (IRNN) minimization to sharpen salient thermal targets and enhance contrast; concurrently, the background layer undergoes a cross-modal, cross-layer consistency fusion strategy that integrates spatial textures across frequency bands to maintain structural fidelity and detail richness. This dual-layer paradigm, augmented by multi-scale aggregation, ensures seamless, artifact-free fusion. Systematic experiments on two benchmark datasets, TNO and RoadScene, show that our method outperforms state-of-the-art baselines, and extended experiments on the MSRS dataset further confirm its strong generalization capability and robustness. Furthermore, systematic hyperparameter experiments determine the optimal model configuration, and ablation studies substantiate the effective contribution of both the pyramid segregation module and the IRNN optimization module to the final fusion performance. Full article
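The Gaussian–Laplacian decomposition and layer-wise fusion underlying such pyramid methods can be sketched in NumPy; the binomial blur, nearest-neighbour expand, and max-absolute/average fusion rules below are deliberately simple stand-ins for illustration, not the paper's target-separation or IRNN-based scheme:

```python
import numpy as np

def downsample(img):
    """Blur with a separable 3-tap binomial kernel, then take every second pixel."""
    k = np.array([0.25, 0.5, 0.25])
    blurred = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    blurred = np.apply_along_axis(lambda c: np.convolve(c, k, mode="same"), 0, blurred)
    return blurred[::2, ::2]

def upsample(img, shape):
    """Nearest-neighbour expansion back to `shape` (crude stand-in for pyramid expand)."""
    up = img.repeat(2, axis=0).repeat(2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_pyramid(img, levels=3):
    """Each level stores the detail lost by downsampling; the coarsest Gaussian is the base."""
    pyr, cur = [], img.astype(float)
    for _ in range(levels):
        nxt = downsample(cur)
        pyr.append(cur - upsample(nxt, cur.shape))  # band-pass detail layer
        cur = nxt
    pyr.append(cur)  # low-frequency residual
    return pyr

def reconstruct(pyr):
    """Invert the decomposition: expand and add detail layers back, coarse to fine."""
    cur = pyr[-1]
    for lap in reversed(pyr[:-1]):
        cur = lap + upsample(cur, lap.shape)
    return cur

def fuse(pyr_a, pyr_b):
    """Max-absolute rule on detail layers, average on the base (illustrative rule only)."""
    fused = [np.where(np.abs(a) >= np.abs(b), a, b) for a, b in zip(pyr_a[:-1], pyr_b[:-1])]
    fused.append(0.5 * (pyr_a[-1] + pyr_b[-1]))
    return fused
```

Because each detail layer stores exactly what the expand step loses, `reconstruct(laplacian_pyramid(img))` recovers the input exactly, which is what makes per-level fusion rules safe to apply independently.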
(This article belongs to the Special Issue Computational Optical Imaging: Theories, Algorithms, and Applications)
