Search Results (1,604)

Search Parameters:
Keywords = shallow network

27 pages, 6034 KB  
Article
Artificial Intelligence-Based Prediction of Compressive Strength in High-Performance Eco-Friendly Concrete Incorporating Recycled Waste Glass
by Ofelia Cornelia Corbu, Anca Gabriela Popa and Sepehr Ghafari
Materials 2026, 19(6), 1050; https://doi.org/10.3390/ma19061050 - 10 Mar 2026
Abstract
This study investigates the application of artificial intelligence for predicting the compressive strength of a high-performance, eco-efficient engineered cementitious composite (ECC), designated mix S8-1, A. The composite incorporates supplementary cementitious materials and alternative aggregates derived from recycled glass waste. The binder system combines waste glass powder and silica fume, while the aggregate fraction includes recycled cobalt glass. An extensive experimental program involving 14 mixtures tested at 7, 28, 56, 90, and 120 days was performed to establish the reference mechanical and rheological properties. Mix S8-1, A achieved strength class C60/75 and workability corresponding to consistency class S4. To substantiate long-term performance, microstructural and chemical analyses were conducted on specimens preserved since 2011, using scanning electron microscopy (SEM) and X-ray fluorescence (XRF). The results confirmed a stable, densified microstructure, evidencing the long-term durability of the patented ECC formulation. For predictive modeling, a shallow feedforward artificial neural network with three hidden layers was developed and trained on 70 dataset entries representing mixture proportions and curing ages. Model performance was evaluated using cross-validation, achieving a coefficient of determination (R²) of 0.968, a mean absolute error of 1.96 MPa, and a root mean square error of 2.52 MPa. The results demonstrate that AI-based approaches can accurately predict the compressive strength of high-performance, environmentally sustainable ECCs incorporating recycled glass constituents, supporting both performance optimization and resource-efficient material design.
(This article belongs to the Section Construction and Building Materials)
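The abstract above quotes R², MAE, and RMSE for the strength predictions. As a minimal sketch of how those three scores are usually defined (this is not the authors' code; `regression_metrics` is a hypothetical helper and the sample strengths are invented for illustration):

```python
import math

def regression_metrics(y_true, y_pred):
    """Return (r2, mae, rmse) for two equal-length sequences of floats."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(r * r for r in residuals)               # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    mae = sum(abs(r) for r in residuals) / n
    rmse = math.sqrt(ss_res / n)
    return r2, mae, rmse

# Hypothetical compressive strengths in MPa, not data from the study.
measured = [60.0, 70.0, 80.0, 90.0]
predicted = [62.0, 69.0, 78.0, 91.0]
r2, mae, rmse = regression_metrics(measured, predicted)
```

Because RMSE squares each residual before averaging, it is never smaller than MAE, which matches the ordering of the two errors reported in the abstract.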

22 pages, 6941 KB  
Article
Study on the Impact of Viscoelastic Surfactants on the Reaction-Retarding Performance of Carbonate Reservoir Acidizing
by Wenhao Tian, Juan Du, Yaochen Li and Jinlong Li
Processes 2026, 14(5), 873; https://doi.org/10.3390/pr14050873 - 9 Mar 2026
Abstract
Conventional hydrochloric acid (HCl) acidizing in carbonate reservoirs is often limited by excessively rapid acid–rock reactions and preferential flow through high-permeability paths, resulting in shallow penetration and inefficient stimulation. Viscoelastic surfactant (VES)-based diverting acids have been widely applied to address these challenges; however, the intrinsic relationship between reaction retardation and diversion efficiency, particularly under varying shear conditions, remains insufficiently clarified. In this study, a VES-based diverting acid system formulated with erucamidopropyl hydroxysultaine (EH50) was systematically investigated through multiscale experiments, including rotating disk reaction kinetics, rheological characterization, porous core flooding, and fracture-scale plate flow tests. The results reveal a pronounced shear-dependent transition in the governing mechanism of the system. Under low-shear conditions, the VES system significantly reduces the apparent acid–rock reaction rate, with a maximum reduction of 77.3%, and exhibits a synergistic retardation effect in the presence of Ca2+, indicating mass transfer limitation. However, under high-shear porous media flow, the intrinsic retarding effect is substantially weakened due to partial disruption of the viscoelastic structure. Despite this attenuation of chemical retardation, effective diversion performance persists under dynamic flow conditions, manifested by pressure plateau behavior, enhanced flow redistribution, more distributed wormhole networks, and greater overall dissolution. Fracture-scale experiments further demonstrate that the diversion acid suppresses excessive inlet etching and promotes spatially distributed etching patterns favorable for fracture conductivity maintenance. These findings clarify that reaction retardation and diversion are distinct yet dynamically coupled mechanisms, whose relative dominance depends on shear intensity and ionic environment. The proposed shear-responsive mechanism framework provides new insight into the design and optimization of VES diverting acid systems for carbonate reservoir stimulation.
(This article belongs to the Topic Advanced Technology for Oil and Nature Gas Exploration)

35 pages, 83521 KB  
Article
AI-Native Multi-Scale Attention Fusion for Ubiquitous Aerial Sensing: Small Object Detection in UAV Imagery
by Ke Ma, Zhongjie Zhang, Jiarui Zhang and Jian Huang
Electronics 2026, 15(5), 1100; https://doi.org/10.3390/electronics15051100 - 6 Mar 2026
Viewed by 90
Abstract
Ubiquitous aerial sensing with unmanned aerial vehicles (UAVs) is becoming an essential component of AI-native perception systems, motivated by the trend toward edge deployment and potential integration with future sixth-generation (6G)-connected aerial networks. In this work, we focus on improving the perception-side accuracy and computational efficiency of small-object detection in UAV imagery. However, small object detection in high-altitude UAV imagery remains highly challenging due to the extremely low pixel occupancy of targets and the severe multi-scale interference introduced by complex backgrounds. To address these limitations, we propose a Multi-scale Attention Fusion Network (MAF-Net), an AI-native paradigm for real-time small object detection in UAV imagery. The proposed approach enhances small-target representation and robustness through three key designs. First, a density-adaptive anchor optimization strategy is developed by combining K-means++ clustering with an IoU-based distance metric, enabling anchors to better match scale variation under diverse object densities. Second, a multi-scale feature reinforcement module is introduced to strengthen fine-grained detail preservation by integrating shallow feature maps via skip connections and hierarchical aggregation. Third, a dual-path attention mechanism is employed to jointly model channel importance and spatial localization, improving discriminative feature calibration in cluttered aerial scenes. Extensive experiments on three public benchmarks (AI-TOD, DOTA, and RSOD) demonstrate that MAF-Net consistently outperforms the baseline detector, achieving mAP@0.5 gains of 14.1%, 11.28%, and 22.09%, respectively. These results confirm that MAF-Net provides an effective and deployment-friendly solution for robust small object detection, supporting real-time UAV-based inspection and AI-native ubiquitous aerial sensing applications.
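The anchor strategy in the abstract above combines K-means++ seeding with an IoU-based distance. A rough sketch of that general idea, assuming the common choice d = 1 - IoU on (width, height) pairs and a mean update for each centre; the density-adaptive part of the paper is not reproduced, and `wh_iou`, `kmeanspp_anchors`, and the box sizes are hypothetical:

```python
import random

def wh_iou(a, b):
    """IoU of two boxes given as (width, height), aligned at a shared corner."""
    inter = min(a[0], b[0]) * min(a[1], b[1])
    return inter / (a[0] * a[1] + b[0] * b[1] - inter)

def kmeanspp_anchors(boxes, k, iters=50, seed=0):
    """Cluster (w, h) pairs into k anchors using the distance 1 - IoU."""
    rng = random.Random(seed)
    # k-means++ seeding: later centres are drawn with probability
    # proportional to the squared distance to the nearest chosen centre.
    centres = [rng.choice(boxes)]
    while len(centres) < k:
        d2 = [min(1.0 - wh_iou(b, c) for c in centres) ** 2 for b in boxes]
        centres.append(rng.choices(boxes, weights=d2, k=1)[0])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = min(range(k), key=lambda i: 1.0 - wh_iou(b, centres[i]))
            clusters[best].append(b)
        # Assumption: centres are updated with the per-cluster mean.
        new_centres = [
            (sum(m[0] for m in ms) / len(ms), sum(m[1] for m in ms) / len(ms))
            if ms else centres[i]
            for i, ms in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return sorted(centres, key=lambda c: c[0] * c[1])

# Hypothetical box sizes in pixels, not taken from any benchmark.
boxes = [(10, 12), (11, 13), (9, 11), (30, 28), (32, 30), (29, 31),
         (60, 80), (62, 78), (58, 82)]
anchors = kmeanspp_anchors(boxes, k=3)
```

Using 1 - IoU rather than Euclidean distance keeps large boxes from dominating the clustering, which is presumably why it suits small-object data.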

22 pages, 25254 KB  
Article
BFI-YOLO: A Lightweight Bidirectional Feature Interaction Network for Aluminum Surface Defect Detection
by Tianyu Guo, Songsong Li, Weining Li, Qiaozhen Zhou and Luyang Shi
Electronics 2026, 15(5), 1080; https://doi.org/10.3390/electronics15051080 - 4 Mar 2026
Viewed by 144
Abstract
As a critical step in industrial quality control, surface defect detection in aluminum materials remains challenging for minor defects despite advances in deep learning. To address this, this paper proposes an enhanced YOLOv8-based model, BFI-YOLO, that incorporates a Bidirectional Multi-scale Residual Network. Specifically, we design a Bidirectional Multi-scale Feature Pyramid Network (BM-FPN) based on BiFPN to strengthen cross-scale feature fusion. The parameter-free SimAM attention module is embedded to enhance subtle defect responses while suppressing background texture interference, without introducing additional computational overhead. Furthermore, we develop a Multi-scale Residual Convolution (MSRConv) module to capture defects of varying sizes on aluminum surfaces comprehensively. MSRConv utilizes multi-scale convolutional kernels to adapt to cross-scale defect features and retains shallow details via residual connections, thereby strengthening the model’s representation of fine defects. Extensive experiments on the public TAPSDD dataset show that BFI-YOLO achieves a precision of 91.3%, a recall of 89.8%, and mAP@0.5 of 92.1%, with only 1.8 M parameters. Compared to the baseline, BFI-YOLO reduces parameters by 40% while increasing mAP@0.5 by 4.2%, effectively balancing detection accuracy and lightweight performance. Optimized for resource-constrained industrial platforms such as embedded systems and mobile robots, BFI-YOLO meets real-time monitoring requirements while achieving competitive detection accuracy, providing an efficient and practical solution for metal surface defect detection.
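The abstract above relies on SimAM, a parameter-free attention module. The sketch below follows the formulation commonly published for SimAM (an inverse-energy term passed through a sigmoid) on a single channel; it is an illustration under that assumption, not the BFI-YOLO implementation, and the `simam` helper, the `e_lambda` default, and the toy feature map are hypothetical:

```python
import math

def simam(channel, e_lambda=1e-4):
    """Reweight one channel (2-D list of floats) with SimAM-style attention."""
    values = [v for row in channel for v in row]
    n = len(values) - 1                       # number of "other" positions
    mu = sum(values) / len(values)
    sq = [[(v - mu) ** 2 for v in row] for row in channel]
    var = sum(sum(row) for row in sq) / n
    out = []
    for row, sq_row in zip(channel, sq):
        out_row = []
        for v, d in zip(row, sq_row):
            # Positions far from the channel mean get a larger weight.
            inv_energy = d / (4.0 * (var + e_lambda)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))
            out_row.append(v * weight)
        out.append(out_row)
    return out

# Toy 3x3 feature map with one salient response in the centre.
feature = [[1.0, 1.0, 1.0],
           [1.0, 5.0, 1.0],
           [1.0, 1.0, 1.0]]
attended = simam(feature)
```

No weights are learned here; the only constant is the regulariser `e_lambda`, which is consistent with the abstract's description of the module as parameter-free.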

21 pages, 4214 KB  
Article
A Lightweight and Sustainable UAV-Based Forest Fire Detection Algorithm Based on an Improved YOLO11 Model
by Shuangbao Ma, Yongji Hui, Yapeng Zhang and Yurong Wu
Sustainability 2026, 18(5), 2436; https://doi.org/10.3390/su18052436 - 3 Mar 2026
Viewed by 127
Abstract
Unmanned aerial vehicle (UAV) forest fire detection is vital for forest safety. However, early-stage UAV fire scenarios often involve small targets, weak smoke signals, and strict onboard resource constraints, which pose significant challenges to existing detectors. To improve the speed and accuracy of UAV forest fire detection, this paper proposes a lightweight fire detection algorithm, AHE-YOLO, specifically designed for UAVs. The proposed method adopts a coordinated lightweight design to improve feature preservation and cross-scale representation under limited computational budgets. Specifically, the Adaptive Downsampling (ADown) module preserves shallow fire-related cues during spatial reduction, improving sensitivity to small flame and smoke targets. The high-level screening-feature fusion pyramid network (HS-FPN) introduces cross-scale attention to promote more discriminative multi-level feature interaction while reducing redundant computation. Furthermore, the Efficient Mobile Inverted Bottleneck Convolution (EMBC) module is employed to improve receptive-field efficiency and feature selectivity under lightweight constraints, further enhancing detection accuracy and inference speed. Finally, the performance of AHE-YOLO is comprehensively evaluated through ablation and comparative experiments on the same dataset. The final experimental results show that AHE-YOLO achieves a mean average precision (mAP) of 94.8% while reducing model parameters by 39.7%, decreasing FLOPs by 27.0%, and shrinking the model size by 36.4%. In addition, its inference speed improves by 16.5%. Beyond detection performance, the proposed framework supports sustainable forest monitoring by enabling early fire warning with reduced computational and energy demands, showing strong potential for real-time deployment on resource-constrained UAV and edge platforms.

28 pages, 2976 KB  
Article
DeepHits: A Multimodal CNN Approach to Hit Song Prediction
by Michael Nofer, Valdrin Nimani and Oliver Hinz
Mach. Learn. Knowl. Extr. 2026, 8(3), 58; https://doi.org/10.3390/make8030058 - 2 Mar 2026
Viewed by 1093
Abstract
Hit Song Science aims to forecast a song’s success before release and benefits from integrating signals beyond audio content alone. We present DeepHits, an end-to-end multimodal network that combines (i) log-Mel spectrogram embeddings from a compact residual 2D-CNN, (ii) frozen multilingual BERT lyric embeddings, and (iii) structured numeric features including high-level Spotify audio descriptors and contextual metadata (artist popularity, release year). Evaluated on 92,517 tracks from the SpotGenTrack dataset, DeepHits achieves a macro-F1 of 52.20% (accuracy 82.63%) in the established three-class setting and a macro-F1 of 23.15% (accuracy 37.00%) in a ten-class decile benchmark. To contextualize fine-grained performance, we report capacity-controlled shallow baselines, including metadata-only and early/late fusion variants, and show that the deep multimodal model provides a clear gain over these references (e.g., metadata-only: macro-F1 20.92%; accuracy 34.22%). Ablation results indicate that removing metadata yields the largest degradation in class-balanced performance, highlighting the strong predictive value of artist popularity and release year. Overall, DeepHits provides a reproducible benchmark and modality analysis for fine-grained popularity prediction under class imbalance.
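The abstract above reports accuracy well above macro-F1 in the three-class setting, a gap that is typical under class imbalance. A small sketch of why the two diverge, using invented labels rather than anything from SpotGenTrack (`accuracy_and_macro_f1` is a hypothetical helper):

```python
def accuracy_and_macro_f1(y_true, y_pred):
    """Return (accuracy, macro-F1) for two equal-length label sequences."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    f1_scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class that is never predicted and never correct scores 0.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    # Macro averaging gives every class the same weight, however rare.
    return accuracy, sum(f1_scores) / len(f1_scores)

# Hypothetical imbalanced data: a classifier that always predicts "low".
y_true = ["low"] * 8 + ["mid", "high"]
y_pred = ["low"] * 10
acc, macro_f1 = accuracy_and_macro_f1(y_true, y_pred)
```

In this toy case the majority-class predictor reaches 80% accuracy but a macro-F1 below 30%, which is why class-balanced scores are the more informative figure for popularity prediction.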

22 pages, 23521 KB  
Article
Superpixel-Tokenized and Frequency-Modulated Hybrid CNN–Transformer for Remote Sensing Semantic Segmentation
by Xinlin Xie, Chenhao Chang, Yunyun Yang and Gang Xie
Remote Sens. 2026, 18(5), 754; https://doi.org/10.3390/rs18050754 - 2 Mar 2026
Viewed by 176
Abstract
Remote sensing semantic segmentation is fundamental for fine-grained urban scene understanding, which in turn provides pixel-level semantic insights for urban development and environmental surveillance. However, existing hybrid segmentation architectures fail to incorporate intrinsic geometric and physical priors, inevitably leading to structural fragmentation, boundary ambiguity, and spatial misalignment of heterogeneous features. Therefore, we propose a Superpixel-Tokenized and Frequency-Modulated Hybrid CNN–Transformer network (SFCT-Net) for remote sensing semantic segmentation. The proposed network integrates superpixel tokens and high-frequency constraints to preserve structural integrity and boundary precision. First, our Superpixel-Tokenized Linear Position Attention (STLPA) module replaces rigid window tokens with semantic superpixels to ensure object integrity with linear computational complexity. Second, we construct a Frequency-Modulated Deformable Edge Refinement (FMDER) module that leverages high-frequency spectral priors to modulate deformable sampling, achieving robust boundary recovery. Finally, we develop the Spatial–Semantic Feature Coupling (SSFC) module, which employs a dual-branch strategy to correct spatial drift and align deep semantic features with shallow details. Experiments conducted on our self-built Taiyuan Satellite Remote Sensing Dataset (TSRSD) along with the ISPRS Vaihingen and Potsdam benchmark datasets demonstrate that our proposed SFCT-Net delivers state-of-the-art performance and efficiency by fusing superpixel and frequency priors for robust structural and boundary recovery.

27 pages, 13433 KB  
Article
HAMD-DETR: A Wind Turbine Defect Detection Method Integrating Multi-Scale Feature Perception
by Shuhao Tian, Pengpeng Zhang and Lin Liu
Energies 2026, 19(5), 1235; https://doi.org/10.3390/en19051235 - 2 Mar 2026
Viewed by 200
Abstract
Wind turbines operating in harsh environments are prone to surface defects that compromise efficiency and safety. Traditional convolutional neural networks lack sufficient multi-scale feature representation, while Transformer-based methods suffer from excessive computational complexity. This study proposes HAMD-DETR, an end-to-end detection framework for wind turbine defect identification. The framework consists of three key components: an Adaptive Dynamic Multi-scale Perception Network (ADMPNet), a Hierarchical Dynamic Feature Pyramid Network (HDFPN), and a Dynamic Frequency-Domain Feature Encoder (DFDEncoder). First, ADMPNet integrates multi-scale dynamic integration fusion and adaptive inception depthwise convolution for feature extraction. Then, the HDFPN balances deep semantic and shallow detail features through pyramid adaptive context extraction and gradient refinement modules. Finally, DFDEncoder enhances feature discrimination through frequency-domain transformation. Experiments on wind turbine datasets demonstrate that HAMD-DETR achieves 58.6% mAP50 and 31.7% mAP50-95, representing improvements of 3.1% and 2.1% over the baseline RT-DETR. The proposed method reduces computational complexity by 27.2% and parameters by 30% while achieving a 151.9 FPS inference speed. These results validate HAMD-DETR’s effectiveness for wind turbine defect detection and demonstrate its potential for intelligent operation and maintenance applications.
(This article belongs to the Section F5: Artificial Intelligence and Smart Energy)

22 pages, 4784 KB  
Article
Diversity, Assembly, and Habitat-Driven Dynamics of Microbial Communities in Eutrophic Dianchi Lake, Southwest China
by Jun Chen, Zhizhong Zhang, Bowen Wang, Jiaojiao Yang, Guangxiu Cao, Jinyan Dong, Tao Li and Yanying Guo
Microorganisms 2026, 14(3), 554; https://doi.org/10.3390/microorganisms14030554 - 28 Feb 2026
Viewed by 230
Abstract
Microbial communities are key regulators of ecological processes in aquatic ecosystems and serve as sensitive indicators of environmental change. Here, we investigated the diversity, assembly mechanisms, and spatial differentiation of bacterial and fungal communities across three representative regions of Dianchi Lake, a large, shallow, eutrophic plateau lake in Southwest China characterized by severe nutrient enrichment and organic pollution. The lake was divided into a submerged macrophyte remnant zone (SubmP), the heavily polluted Caohai area (hPollut), and a cyanobacterial bloom zone (HABs). Amplicon sequencing of the 16S rRNA and ITS genes revealed 7862 bacterial and 3141 fungal OTUs, spanning 69 bacterial phyla (1128 genera) and 9 fungal phyla (477 genera). Although 69 dominant bacterial genera (e.g., Flavobacterium) and 9 dominant fungal genera (e.g., Metschnikowia) were shared across regions, pronounced spatial heterogeneity was observed, primarily driven by total nitrogen and dissolved oxygen. Taxonomic richness and abundance were decoupled: rare (RT) and intermediate taxa (IT) accounted for the most richness, whereas abundant taxa (AT) dominated the total abundance but exhibited comparatively low diversity. IT and RT displayed significantly higher Shannon diversity and greater network robustness than AT; bacterial RT showed the highest robustness (0.35–0.45), while fungal IT demonstrated superior resilience. Community assembly was largely governed by stochastic processes (59–99% contribution), yet deterministic selection exerted stronger effects on IT and RT, particularly for bacteria in SubmP, where habitat heterogeneity enhanced environmental filtering. Functional prediction revealed distinct ecological strategies, with enhanced nitrogen cycling in hPollut, phototrophy in HABs, and pollutant degradation in SubmP. Collectively, these findings demonstrate that rare and intermediate taxa, rather than numerically dominant populations, underpin microbial stability and spatial differentiation in eutrophic lakes, highlighting the importance of nitrogen management and habitat heterogeneity in lake restoration.
(This article belongs to the Special Issue Interaction Between Microorganisms and Environment)
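The abstract above compares Shannon diversity across abundance classes. For readers unfamiliar with the index, a minimal sketch of the standard definition H = -Σ p_i ln p_i follows; the OTU counts are invented and `shannon_index` is a hypothetical helper, not part of the study's pipeline:

```python
import math

def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over non-zero taxon counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical OTU counts: an even community and one dominated by a single taxon.
even = shannon_index([25, 25, 25, 25])
skewed = shannon_index([97, 1, 1, 1])
```

An even community reaches the maximum ln(S) for S taxa, while a community dominated by one taxon scores much lower, which illustrates how high abundance can coincide with low diversity.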

26 pages, 7153 KB  
Article
A Deformable Dual-Branch Visual State-Space Network for Landslide Identification with Multi-Scale Recognition and Irregular Boundary Enhancement
by Bowen Du, Wanchao Huang, Junchen Ye, Bin Tong and Yueping Yin
Remote Sens. 2026, 18(5), 707; https://doi.org/10.3390/rs18050707 - 27 Feb 2026
Viewed by 205
Abstract
In recent years, rapid and reliable interpretation for emergency response to landslides and other geological hazards has become increasingly important. This paper presents DFmamba, an improved deformable dual-branch visual state-space network, to address engineering challenges such as missed large landslide bodies, boundary shifts, and loss of small-scale details. DFmamba mitigates the limited effective receptive field and window-partition constraints that often prevent existing methods from balancing large-area semantic consistency, multi-scale detection, precise boundary delineation, and computational efficiency. It employs a parallel encoder with a convolutional branch and a Visual State-Space Model (VSSM) branch to jointly capture local textures and global context. In the decoder, deformable residual blocks (DRB) enhance geometric modeling of irregular boundaries, while multi-scale feature alignment and a shallow high-frequency injection (MFP) mechanism strengthen boundary responses and preserve fine details. Experiments on the public CAS dataset against representative CNN-, Transformer-, and SSM-based baselines show that DFmamba achieves improved Precision, Recall, F1-score, and IoU, with stable performance across multi-scale scenarios, demonstrating strong robustness for landslide segmentation.

22 pages, 5070 KB  
Article
DEM-Assisted Topography-Conditioned and Orientation-Adaptive Siamese Network for Cross-Region Landslide Change Detection
by Jing Wang, Haiyang Li, Shuguang Wu, Guigen Nie, Yukui Yu and Zhaoquan Fan
Remote Sens. 2026, 18(5), 702; https://doi.org/10.3390/rs18050702 - 26 Feb 2026
Viewed by 192
Abstract
Automated landslide change detection using remote sensing imagery is critical for rapid disaster response. However, landslide change detection using bi-temporal optical imagery is frequently degraded by cross-region domain shifts and by the elongated, anisotropic morphology of landslide boundaries, leading to substantial pseudo-change alarms. To suppress pseudo-changes and improve cross-region robustness, we propose a DEM-assisted topography-conditioned and orientation-adaptive Siamese network (DEMO-Net) that injects topographic inductive bias through terrain-conditioned feature modulation and orientation-adaptive convolutions. Specifically, DEM-derived multi-channel priors are encoded to predict spatially varying FiLM parameters that recalibrate shallow optical features, suppressing spurious changes while preserving discriminative cues. In addition, we introduce an adaptive-oriented attention convolution that leverages a DEM-derived aspect to guide sparse multi-orientation aggregation via shared-kernel transformation, enabling direction-aware receptive-field alignment for elongated and direction-varying landslide structures without costly global attention. Experiments on the GVLM benchmark under a 5-fold site-wise cross-region protocol show that DEMO-Net achieves 85.17% F1 and 74.26% mIoU, outperforming the strongest CNN baseline FC-EF by 5.05% and 7.20%, respectively. These results demonstrate the effectiveness of jointly leveraging terrain-conditioned calibration and physically consistent orientation-aligned feature extraction for robust cross-region landslide change detection.
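The abstract above uses spatially varying FiLM parameters to recalibrate shallow optical features. As a minimal sketch, FiLM applies a per-position scale and shift, y = γ·x + β; the encoder that would predict γ and β from DEM priors is omitted, and the `film` helper and all numbers below are hypothetical:

```python
def film(features, gamma, beta):
    """Feature-wise linear modulation, y = gamma * x + beta, on 2-D lists."""
    return [
        [g * x + b for x, g, b in zip(f_row, g_row, b_row)]
        for f_row, g_row, b_row in zip(features, gamma, beta)
    ]

# Hypothetical 2x2 feature map with per-pixel scale and shift maps.
features = [[1.0, 2.0], [3.0, 4.0]]
gamma = [[1.0, 0.5], [0.0, 2.0]]    # a scale of 0 suppresses a response entirely
beta = [[0.0, 0.0], [0.5, -1.0]]
modulated = film(features, gamma, beta)
```

Because γ and β vary per pixel, the conditioning signal can damp responses in some regions and amplify them in others, which is the behaviour the abstract attributes to terrain-conditioned calibration.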

28 pages, 5678 KB  
Article
FKIFM-DETR: A Multi-Domain Fusion-Based Transformer Framework for Small-Target Detection in UAV Remote Sensing Imagery
by Fan Yang, Long Chen, Xiaoguang Wang, Yang Zhang, Hongyu Li, Min He and Li Shen
Remote Sens. 2026, 18(5), 700; https://doi.org/10.3390/rs18050700 - 26 Feb 2026
Viewed by 206
Abstract
Unmanned Aerial Vehicle (UAV) remote sensing has become essential for real-time earth observation applications, including precision agriculture, traffic monitoring, and disaster response. However, small-target detection in UAV aerial imagery still faces critical challenges: extreme scale variation due to variable flight altitudes, background interference from complex terrain, and insufficient pixel information for tiny objects. To address these issues, this work proposes FKIFM-DETR, a real-time transformer-based detection framework leveraging multi-domain information fusion. First, a Spatial-Frequency Fusion Module (SFM) is designed to integrate spatial and frequency-domain features for capturing fine-grained target details while suppressing background noise; second, a High–Low Frequency Block (HL-Block) is introduced to separately process high-frequency local details and low-frequency global context, balancing detail retention and semantic awareness; finally, a Channel Feature Recalibration-Enhanced Feature Pyramid Network (SPCR-FPN) is employed to strengthen the interaction between shallow spatial features and deep semantic features. On the VisDrone2019 dataset, FKIFM-DETR achieves 6.3% and 5.3% improvements in mAP@0.5 and mAP@0.5:0.95 over the RT-DETR baseline, respectively; evaluations on TinyPerson and HIT-UAV datasets further demonstrate its cross-scenario applicability. These results demonstrate the potential of FKIFM-DETR for practical UAV remote sensing applications such as crowd surveillance, vehicle tracking, and emergency rescue.

25 pages, 9279 KB  
Article
A Multi-Scale Global Fusion-Based Method for Surface Fissure Extraction from UAV Imagery
by Mingxi Zhou, Min Ji, Fengxiang Jin, Zhaomin Zhang, Fengke Dou and Xiangru Fan
Sensors 2026, 26(5), 1440; https://doi.org/10.3390/s26051440 - 25 Feb 2026
Viewed by 241
Abstract
The prevalence of ground fissures in deformation-affected areas has intensified, presenting serious risks to both operational safety and the local natural environment. Fissures in these disturbed terrains are typically characterized by elongated morphologies and large-scale variations, which pose substantial challenges to accurate feature [...] Read more.
The prevalence of ground fissures in deformation-affected areas has intensified, presenting serious risks to both operational safety and the local natural environment. Fissures in these disturbed terrains are typically characterized by elongated morphologies and large-scale variations, which pose substantial challenges to accurate feature extraction. To address these complexities, this paper proposes a semantic segmentation network termed MGF-UNet. In the shallow layers, we integrate multi-scale feature sensing (MFS) and grouped efficient multi-scale attention (EMA) to sharpen anisotropic textures and boundary details under high-resolution representations. For the deeper layers, a Token-Selective Context Transformer (TSCT) is designed to perform selective global modeling on high-level semantic features, effectively capturing long-range dependencies while preserving the structural integrity of elongated fissures. Meanwhile, we employ feature-wise linear modulation (FiLM) to derive pixel-wise affine parameters from shallow structures, which pre-modulate deep features and strengthen cross-level interactions. In the decoder, a Fourier transform-based adaptive feature fusion (AFF) module suppresses background noise and enhances boundary contrast, followed by cross-scale aggregation for final prediction.Benchmark tests conducted on the mining-area fissure dataset (MFD) and road-based datasets demonstrate that MGF-UNet achieves an accuracy of 78.2%, a Dice score of 81.4%, and an IoU of 68.6%, outperforming existing mainstream networks. The results confirm that MGF-UNet provides an effective solution for automatic fissure extraction in deformation-prone environments, offering significant potential for geohazard monitoring and ecological restoration. Full article

22 pages, 54739 KB  
Article
Synergizing Residual and Dense Architectures for Fine-Grained Oil Palm Grading: A Deep Feature Concatenation Approach
by Yang Luo, Anwar P. P. Abdul Majeed, Zaid Omar, Sandeep Jagtap, Guillermo Garcia-Garcia and Yi Chen
Mathematics 2026, 14(5), 769; https://doi.org/10.3390/math14050769 - 25 Feb 2026
Viewed by 206
Abstract
Accurate grading of Oil Palm Fresh Fruit Bunches (FFB) is pivotal for maximizing agricultural yield, yet manual assessment in unstructured environments remains labor-intensive and subjective. While Convolutional Neural Networks (CNNs) offer an automated solution, the conventional strategy of scaling network depth often yields diminishing returns or overfitting on moderately sized datasets. To overcome these limitations, this study proposes the Deep Feature Concatenation (DFC) framework. Rather than deepening a single architecture, this methodology synergizes the spatial hierarchy preservation of ResNet50 with the dense feature-reuse mechanisms of DenseNet121. This fusion creates a composite representation space that captures complementary inductive biases. To ensure computational efficiency, the framework decouples representation learning from inference. Principal Component Analysis (PCA) retains 99% of explained variance while compressing features by 68%. These optimized representations are classified using shallow linear probes. Validated on a single-source dataset expanded to 4000 images (derived from 466 original samples) using a rigorous “Parent–Child” split to prevent data leakage, DFC achieved a peak accuracy of 97.75%. McNemar’s statistical test indicated that DFC outperforms the ResNet50 baseline (p = 0.039) for SVM classifiers. However, it is critical to note that these results represent a proof of concept based on a limited biological sample size, particularly for rare defect classes. While the model achieved 100% detection accuracy for critical defects within the specific validation set, the high synthetic-to-original ratio necessitates cautious interpretation regarding external validity. This framework provides a practical foundation for future research into high-precision, low-latency grading systems, but multi-center validation on larger, independent datasets is required to confirm broad generalizability across diverse plantation environments.
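The concatenate-then-compress step described in this abstract can be sketched as follows. This is an illustrative NumPy example under stated assumptions, not the authors' pipeline: the function name `concat_and_pca` is hypothetical, the feature matrices are synthetic stand-ins for embeddings from the two backbones, and the dimensions are chosen small for the demo. It concatenates two feature sets per sample and keeps the smallest number of principal components whose cumulative explained variance reaches 99%.

```python
import numpy as np


def concat_and_pca(feats_a, feats_b, variance_to_keep=0.99):
    """Concatenate two (n_samples, d) feature matrices along the feature
    axis, then project onto the fewest principal components whose
    cumulative explained variance reaches `variance_to_keep`."""
    x = np.concatenate([feats_a, feats_b], axis=1)
    mean = x.mean(axis=0)
    xc = x - mean  # PCA requires centred data
    # Rows of vt are principal directions; s**2 is proportional to variance.
    _, s, vt = np.linalg.svd(xc, full_matrices=False)
    ratio = np.cumsum(s**2) / np.sum(s**2)
    k = min(int(np.searchsorted(ratio, variance_to_keep)) + 1, len(s))
    components = vt[:k]
    return xc @ components.T, components, mean, float(ratio[k - 1])


# Synthetic stand-in features (assumption): both "backbones" observe the
# same 10-dimensional latent signal, plus a little independent noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 10))
feats_a = latent @ rng.normal(size=(10, 64)) + 0.01 * rng.normal(size=(200, 64))
feats_b = latent @ rng.normal(size=(10, 32)) + 0.01 * rng.normal(size=(200, 32))

reduced, components, mean, kept = concat_and_pca(feats_a, feats_b)
print(reduced.shape, round(kept, 4))
```

The reduced matrix would then be passed to a linear classifier such as a linear SVM. In a leakage-free setup, the mean and components are estimated on the training split only and reused unchanged for validation data.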
(This article belongs to the Special Issue Application of Machine Learning and Data Mining, 2nd Edition)

33 pages, 5215 KB  
Article
Towards Lightweight and Multi-Scale Scene Classification: A Lie Group-Guided Deep Learning Network with Collaborative Attention
by Xuefei Xu and Chengjun Xu
J. Imaging 2026, 12(3), 94; https://doi.org/10.3390/jimaging12030094 - 24 Feb 2026
Viewed by 222
Abstract
Remote sensing scene classification (RSSC) plays a crucial role in Earth observation. Current deep learning methods, while accurate, tend to focus on high-level semantic features and overlook complementary shallow details such as edges and textures. Moreover, conventional CNNs are limited by fixed receptive fields, whereas transformers incur high computational costs. To address these limitations, we propose the Lie Group lightweight multi-scale network (LGLMNet), which integrates Lie Group covariance features. It employs a dual-branch architecture combining Lie Group machine learning (LGML) for shallow feature extraction and a deep learning branch for high-level semantics. In the deep branch, we design a parallel depthwise separable convolution block (PDSCB) for multi-scale perception and a spatial-channel collaborative attention mechanism (SCCA) for efficient global–local modeling. A cross-layer feature fusion block (CLFFB) effectively merges the two branches. Compared with state-of-the-art methods, the proposed LGLMNet achieves accuracy improvements of 2.14%, 2.32%, and 1.12% on UCM-21, AID, and NWPU-45 datasets, respectively, while maintaining a lightweight structure with only 2.6 M parameters.
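One reason depthwise separable convolutions suit lightweight designs is their parameter count. The sketch below compares a standard convolution with a generic depthwise separable one (a depthwise k×k filter per input channel followed by a 1×1 pointwise convolution). It illustrates the general technique only; the internal layout of the PDSCB is not given in the abstract, and the channel sizes used here are assumptions. Bias terms are ignored.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in, c_out, k):
    """Weights in a depthwise k x k convolution (one filter per input
    channel) followed by a 1 x 1 pointwise convolution (no bias)."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 128, 3
standard = standard_conv_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)

print(standard, separable, round(separable / standard, 4))
```

The ratio of the two counts equals 1/c_out + 1/k², so for 3×3 kernels and wide layers the separable form needs roughly one ninth of the weights.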
(This article belongs to the Section AI in Imaging)
