2.1. Tiling Strategies for Large-Scale Remote Sensing Images
Ultra-wide-area RSIs inherently possess extremely high resolution, making direct GPU processing infeasible. While convolutional neural networks (CNNs) and advanced detection frameworks have driven significant progress in object detection for ultra-high-resolution RSIs [13,14,15,16], fundamental computational challenges persist. Early approaches predominantly relied on sliding-window uniform tiling [17,18]. Although this paradigm ensures comprehensive spatial coverage, it incurs substantial computational waste in target-scarce regions such as forests, farmland, and bare land.
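For concreteness, the sliding-window uniform tiling paradigm can be sketched as follows; the tile size and overlap are illustrative assumptions, not values taken from the cited works:

```python
def uniform_tiles(width, height, tile=1024, overlap=128):
    """Generate (x, y, w, h) crops for uniform sliding-window tiling.

    Every region is covered regardless of content, which is why this
    paradigm wastes computation on target-scarce areas: empty tiles
    cost exactly as much as target-dense ones.
    """
    stride = tile - overlap
    xs = list(range(0, max(width - tile, 0) + 1, stride))
    ys = list(range(0, max(height - tile, 0) + 1, stride))
    # Append a final tile so the right/bottom borders stay covered.
    if xs[-1] + tile < width:
        xs.append(width - tile)
    if ys[-1] + tile < height:
        ys.append(height - tile)
    return [(x, y, tile, tile) for y in ys for x in xs]
```

Every tile is forwarded to the detector unconditionally, which is the computational-waste behavior the adaptive strategies below try to avoid.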
To mitigate this, Lin et al. proposed a superpixel-based tiling strategy [11], reducing tile quantity by approximately 28%. However, the requisite superpixel segmentation imposes significant computational overhead, increasing overall inference time by 29.9% and adding data-dependent pre-processing steps that reduce portability across datasets.
Alternatively, Xie et al. introduced an objectness activation network that filters target-free sub-images through grid-level prediction, achieving a 30% acceleration [12]. Nevertheless, this method performs poorly on small-scale objects: it frequently misclassifies sub-images containing small targets as target-free regions and thereby induces missed detections, a failure mode tied to its coarse grid granularity and lack of multi-scale semantic fusion.
Critically, existing methodologies predominantly rely on low-level visual features and lack scene-level semantic understanding, so they struggle to balance efficiency and accuracy. Consequently, achieving substantial inference acceleration without compromising detection accuracy remains a core challenge in ultra-wide-area RSI processing. To address this limitation, we propose a semantics-guided secondary tiling mechanism driven by scene heatmaps: high-attention regions (HARs) undergo fine-grained sliding-window tiling with a high-accuracy detection model to maintain sensitivity to small-scale targets, whereas low-attention regions (LARs) receive coarse-grained tiling combined with a computationally efficient lightweight detector for accelerated processing. This approach enhances detection efficiency while effectively preventing small-target omissions, providing a new paradigm for efficient intelligent interpretation of ultra-wide-area RSIs.
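The secondary tiling step described above can be sketched as follows; the grid layout, tile sizes, and detector labels are illustrative assumptions rather than our actual configuration:

```python
def secondary_tiles(region_mask, coarse=1024, fine=512):
    """Sketch of semantics-guided secondary tiling.

    region_mask[r][c] is True for a high-attention region (HAR) and
    False for a low-attention region (LAR). HAR cells are re-tiled at
    fine granularity and routed to the high-accuracy detector; LAR
    cells keep one coarse tile for the lightweight detector.
    """
    jobs = []
    for r, row in enumerate(region_mask):
        for c, is_har in enumerate(row):
            x0, y0 = c * coarse, r * coarse
            if is_har:
                # Fine-grained tiles preserve sensitivity to small targets.
                for dy in range(0, coarse, fine):
                    for dx in range(0, coarse, fine):
                        jobs.append(("precise", x0 + dx, y0 + dy, fine))
            else:
                # One coarse tile per sparse cell for fast processing.
                jobs.append(("light", x0, y0, coarse))
    return jobs
```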
2.2. Efficient Object Detection in Ultra-High-Resolution Remote Sensing Images
With the rapid advancement of imaging technologies, the resolution of remotely sensed data continues to increase, from satellite and Unmanned Aerial Vehicle (UAV) platforms to automotive scenes and 4K/8K video, making efficient object detection in such large-scale images a critical challenge. Although lightweight detectors (e.g., YOLO [19,20], RetinaNet [21], RefineDet [22], HSD [23], FCOS [24]) can accelerate inference, applying these models directly to ultra-high-resolution images still incurs prohibitive computational costs. Model compression techniques, such as channel pruning on YOLOv5 [25], have been shown to improve frames per second (FPS) by up to 40%, but often at the expense of small-object mAP.
Recent studies have explored coarse-to-fine cascade strategies to further accelerate detection on large images. For instance, the Objectness Activation Network (OAN) employs a lightweight binary branch to predict the presence of objects in each tile, thereby skipping empty regions [12]; ClusDet generates clustered candidate regions via a clustering network before selective detection [26]; and DMNet uses a density map to localize regions of interest and restricts detection to those areas [27].
However, these approaches still exhibit significant limitations in inference speed, training complexity, and general applicability. In particular, density-map-guided cropping may improve small-object recall under some conditions, but it struggles with scale variation, requires dense supervision that is costly to produce, and can create fragmented or redundant crops that compromise the integrity of large objects. Such methods therefore trade improved local recall for higher annotation and computation costs and reduced robustness to scale and dataset shift. Conceptually, our approach differs fundamentally from density-map-based methods such as DMNet [27]. While DMNet relies on object density estimation, which requires instance-level annotations and may struggle with scale variations, our scene heatmap approach operates at the semantic level, requiring only scene category labels that are easier to obtain and generalize better across datasets. This semantic foundation provides superior scalability to new geographical areas and object categories, as scene semantics tend to be more transferable than precise object density distributions.
ClusDet exhibits similar shortcomings. OAN's fixed-grid objectness predictions cannot reliably detect objects at grid boundaries or those spanning multiple cells, and its lack of multi-scale semantic fusion hampers the balance between precision and speed. To address these limitations, some works have shifted toward feature-domain optimization. Bai et al. [28] propose a remote sensing detection framework that integrates wavelet time-frequency analysis with reinforcement-learning-based feature selection. Their method employs a dueling Deep Q-Network (DQN) to identify dominant time-frequency channels and reduce computational complexity, and introduces a discrete wavelet multi-scale attention module to suppress background interference. In practice, such feature-domain optimizations reduce some spatial processing costs but introduce algorithmic complexity.
In addition, for efficient detection on ultra-wide-area RSIs, CoF-Net [29] presents a progressive coarse-to-fine framework: candidate regions are first rapidly filtered at low resolution, then detection is gradually refined, thereby substantially reducing computation while preserving accuracy. Although conceptually similar to our heatmap-guided adaptive tiling in pursuing coarse-to-fine computation, CoF-Net's reliance on multi-resolution image pyramids introduces nontrivial memory and interpolation overheads, which can hamper throughput on resource-constrained hardware and complicate end-to-end deployment; Bai et al.'s scheme, by contrast, bypasses spatial-pyramid costs via time-frequency feature reconstruction.
More critically, CoF-Net’s low-resolution filtering lacks semantic understanding, making it prone to discarding regions containing small but semantically important objects.
In contrast, our method leverages scene-semantic priors to drive adaptive tiling, enabling more flexible and demand-aware allocation of computational resources. Concretely, the scene classifier we use requires only coarse tile-level supervision (classification labels) rather than dense pixel-wise annotations, which reduces training complexity and annotation cost and improves scalability and transferability across datasets.
Furthermore, by assigning high-resolution processing only to semantically important HARs (high heat, dense target), our approach preserves cross-tile object integrity while avoiding the memory burdens of maintaining multiple full-resolution pyramids, leading to better practical scalability on large images and constrained hardware.
To overcome these limitations, we propose a scene heatmap-guided adaptive tiling and dual-model collaborative framework tailored for ultra-wide-area RSIs. Unlike methods that rely solely on low-level features or grid-level activations, our approach introduces a global semantic perception mechanism. An EfficientNetV2-based classifier generates a heatmap encoding the target-scene correlation for each coarse tile; the image is then partitioned into HARs (high heat, dense targets) and LARs (low heat, sparse targets) via a dynamic threshold. High-attention regions undergo fine-grained tiling and are processed by a high-precision detector, while low-attention regions use coarse tiling and a lightweight model. This demand-driven allocation markedly reduces redundant computation in empty areas, maintains high accuracy for both small and cross-tile large objects, and achieves superior inference efficiency and robustness across diverse resolutions.
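The dynamic-threshold partition can be illustrated with a minimal sketch; the mean-plus-scaled-deviation rule below is a stand-in assumption for exposition, not our exact thresholding criterion:

```python
import numpy as np

def partition_regions(heatmap, k=0.5):
    """Split a coarse-tile scene heatmap into HARs and LARs.

    heatmap holds the classifier-derived target-scene correlation per
    coarse tile. The threshold adapts to the heat distribution of the
    image at hand rather than being fixed globally.
    """
    tau = heatmap.mean() + k * heatmap.std()  # illustrative dynamic rule
    har_mask = heatmap >= tau                 # True -> HAR, False -> LAR
    return har_mask, tau
```

Tiles where `har_mask` is True would then receive fine-grained tiling and the high-precision detector, and the rest the lightweight path.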
2.4. Transformer-Based Hybrid Detection Models
Recent studies have sought to combine convolutional feature extractors with Transformer architectures to exploit both local detail and global context in RSI object detection. Li et al. proposed the TRD (Transformer with Transfer CNN) framework, which fuses a CNN's local representation power with a Transformer's long-range dependency modeling in an end-to-end detection pipeline [32]. Building on this, Zhang et al. introduced RT-DETR by integrating a residual-enhanced CNN backbone with a deformable Transformer detection head, yielding robust query embeddings and dynamic attention maps without requiring region proposal networks, and achieving state-of-the-art mAP on aerial benchmarks [33]. Although these hybrid approaches confirm the feasibility of coupling multi-scale CNN features with end-to-end Transformer decoders, they do not simultaneously address computational redundancy and adaptive receptive-field requirements in ultra-wide-area RSIs.
While pure Transformer architectures such as Swin Transformer [34] and Mask2Former [35] achieve excellent results on general vision benchmarks, we argue that a hybrid CNN-Transformer design is better suited to the object-detection module in ultra-wide-area RSIs. The primary reasons are engineering efficiency at scale and the value of CNN inductive biases when processing thousands of tiles of varying sizes and content.
In our LSK-RTDETR detector, the LSKNet backbone builds multi-scale representations via a large selective-kernel mechanism that adaptively adjusts receptive fields, yielding context modeling comparable to windowed self-attention in Swin while relying on large-kernel convolutions rather than global attention. These convolutions are typically more hardware friendly and incur lower latency on common deep-learning accelerators, which matters when the detector is applied to many tiles per image [34,36].
Moreover, LSKNet's inductive biases (translation equivariance and locality) naturally favor learning compact, position-consistent features that are crucial for detecting small, densely packed objects in remote-sensing images. Although Vision Transformers can learn similar properties from data, they often require larger-scale pretraining or careful multi-scale engineering to match CNNs on fine-grained localization tasks [34,35]. Our hybrid approach therefore combines efficient, locality-aware feature extraction with Transformer-based global reasoning where it matters most.
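The selective-kernel idea, mixing responses from different receptive fields with a data-dependent gate, can be sketched conceptually; real LSKNet uses learned depthwise large-kernel convolutions and a learned selection branch, whereas this NumPy version only illustrates the adaptive-receptive-field mechanism:

```python
import numpy as np

def box_filter(x, k):
    """Naive k x k mean filter with zero padding, standing in for a
    depthwise large-kernel convolution on a single-channel map."""
    p = k // 2
    xp = np.pad(x, p)
    out = np.zeros_like(x, dtype=float)
    h, w = x.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def selective_kernel_mix(x, k_small=3, k_large=7):
    """Mix a small and a large receptive field per pixel.

    The sigmoid gate favors whichever branch responds more strongly at
    each location, mimicking how a selective-kernel module adapts its
    effective receptive field to local content.
    """
    f_small = box_filter(x, k_small)
    f_large = box_filter(x, k_large)
    gate = 1.0 / (1.0 + np.exp(-(f_small - f_large)))  # sigmoid in (0, 1)
    return gate * f_small + (1.0 - gate) * f_large
```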
To this end, we propose the enhanced LSK-RTDETR (Large Selective Kernel–RT-DETR) model integrated into a demand-driven adaptive tiling framework. HARs are processed with an LSKNet large-kernel backbone and a deformable Transformer decoder to strengthen multi-scale representations of small and cross-tile large objects, while LARs employ a lightweight variant to boost throughput. This dual-model collaborative inference strategy not only ensures precise detection in high-value areas but also significantly reduces invalid computation in background regions, achieving a balanced trade-off between detection accuracy and inference efficiency.
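The efficiency argument behind dual-model collaboration admits a back-of-envelope cost model; the parameters below (fine tiles per coarse cell, relative cost of the lightweight detector) are illustrative assumptions rather than measured numbers:

```python
def relative_cost(har_fraction, fine_per_coarse=4, light_cost=0.25):
    """Cost of dual-model inference relative to running the
    high-precision detector on fine tiles everywhere.

    har_fraction    -- fraction of coarse cells classified as HARs
    fine_per_coarse -- fine tiles generated per HAR coarse cell
    light_cost      -- lightweight detector cost per coarse tile,
                       relative to one high-precision fine-tile pass
    """
    baseline = fine_per_coarse * 1.0  # all cells fine-tiled, precise model
    dual = (har_fraction * fine_per_coarse * 1.0
            + (1.0 - har_fraction) * light_cost)
    return dual / baseline
```

Under these assumptions, an image in which only 20% of coarse cells are HARs would cost roughly a quarter of the uniform fine-tiling baseline, which is the source of the speedup when targets are spatially sparse.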