1. Introduction
Propelled by rapid technological development, remote sensing imaging has entered an era defined by multi-platform, multi-payload, and multi-modal observation [1]. In this context, high-precision multi-source image registration has emerged as a fundamental prerequisite for advanced image processing tasks such as fusion, forming the critical foundation for all subsequent analysis [2]. As two pivotal Earth observation modalities, optical cameras provide high-resolution imagery, while Synthetic Aperture Radar (SAR) sensors enable all-weather, all-day imaging. However, optical cameras are easily affected by cloud cover, which reduces observation efficiency, while SAR imaging suffers from significant speckle noise, resulting in poor visual quality [3,4]. Images obtained by a single optical or SAR remote sensing system are limited and often insufficient to meet the diverse needs of practical applications. Complementary imaging systems that integrate optical and SAR sensors have therefore emerged as a research focus in remote sensing [5]. Such systems leverage the synergistic advantages of both modalities, significantly enhancing observation accuracy, data acquisition capability, and robustness against individual sensor limitations. Several successful implementations worldwide have demonstrated the feasibility of optical/SAR joint observation. For instance, the ESA Sentinel-1 (SAR) and Sentinel-2 (optical) constellation enables synergistic data fusion when temporal and spatial resolutions are properly matched [6], significantly enhancing the completeness, accuracy, and timeliness of land surface monitoring. Similarly, China's GF series satellites (the optical GF-1 and GF-2 and the high-resolution SAR satellite GF-3) can accurately characterize complex surface environments, which benefits natural disaster monitoring as well as agricultural and forestry applications [7]. However, most current implementations rely on non-co-aperture architectures [8]. In these systems, multi-source imagery acquired from separate platforms at different times introduces fundamental registration challenges due to temporal disparities, atmospheric variations, and differing imaging geometries [9], complicating the registration process.
Beyond these systemic issues, a further complication arises from the fundamental differences in how optical and SAR images are acquired, leading to divergent imaging mechanisms and radiometric characteristics [10]. Consequently, traditional image-domain registration techniques based on regions, features, and deep learning [11] are fundamentally challenged, compromising their accuracy and hindering real-time processing. Region-based registration methods typically optimize a similarity measure between images; as the search range expands, the computational load grows, reducing registration speed. Wang [12] proposed a fast registration method based on block matching with grayscale normalized mutual information, which improved the registration success rate. However, this method still requires customized segmentation strategies for different image pairs, and its registration accuracy needs further improvement. SIFT and SURF are common feature-based registration methods, but each has distinct limitations. SIFT suffers from high computational cost and a propensity for false matches, reducing its efficiency. Although SURF accelerates registration while preserving SIFT's invariance properties, its sparser feature points intensify the difficulties of cross-modal scenarios, especially between optical and SAR data, where gradient characteristics fundamentally differ. Guo [13] proposed a fast automatic registration method using Edge Point Feature (EPF) angle matching to enhance both registration accuracy and efficiency. However, its performance in optical/SAR registration remains unvalidated and is fundamentally limited by the method's underlying assumptions, indicating that the algorithm requires further improvement. Moreover, deep learning-based registration methods often operate under the constraint of forcing cross-modal images to appear similar, which inevitably discards valuable image details and increases the risk of mismatches. Dou [14] proposed a matching method that combines deep features with wavelet information; this fusion enhances the discriminative power of the features, improving matching performance. Nevertheless, a fundamental challenge for these data-driven methods is their strong dependence on the training dataset and the inherent difficulty of training them effectively.
Therefore, it is necessary to address the fundamental issues of multi-source image registration from both the remote sensing imaging system and the registration approach. Although co-aperture systems exist for guidance (microwave/infrared) [15] and communication (microwave/laser) [16], optical/SAR integrated imaging is an emerging field, with most efforts still at the stage of system design and experimental validation. For instance, the U.S. ORS office's HIGHRISE project (2007) conducted a preliminary design and simulation analysis for a multi-band system [17]. More recently, Wu et al. (2020) designed an airborne infrared/SAR system with partially shared structures [18], and Li et al. (2021) developed an optical/SAR system capable of dual-band operation in the visible, near-infrared, and Ka bands [19]. Beyond hardware integration, the concept of signal-domain alignment, i.e., performing registration during image formation, represents a fundamental paradigm shift. This unified approach to imaging and registration is a nascent but critically important research direction.
Aiming at the bottlenecks in the development of co-aperture imaging systems and multi-source image registration methods, this paper proposes a novel signal-domain consistent imaging method implemented on a newly designed airborne co-aperture system. The system employs spectral and frequency division technology to ensure pointing consistency, establishing a hardware foundation for acquiring spatially and temporally consistent optical/SAR data. The core of our method, a Fourier series high-order fitting consistent imaging method, operates in the wavenumber domain. It unifies the optical and SAR imaging results into a shared pixel coordinate system, achieving sub-pixel registration accuracy by integrating real-time compensation into the SAR imaging process. In contrast to our prior time-domain approach [20], this method significantly enhances computational efficiency and imaging speed while maintaining alignment accuracy. Simulations involving multi-point and structural targets within the overlapping field of view (FOV) validate the feasibility and universality of the proposed wavenumber domain method for real-time deviation compensation.
The main contributions of this work are summarized as follows. (1) We establish a new signal-domain paradigm for optical/SAR co-registration by shifting it from a post-processing task to an integrated part of the wavenumber-domain SAR imaging process, thereby enabling a unified and real-time-capable workflow. (2) To realize this paradigm, we propose a co-designed system-algorithm framework that includes a novel airborne co-aperture system for synchronous data acquisition and a unified geometric model linking SAR imaging deviations to cross-modal pixel errors. (3) At the algorithmic core, we develop and implement a compensation method that models Stolt interpolation errors as a high-order Fourier series and embeds the derived correction function directly into the imaging chain, achieving source-level, sub-pixel co-registration. (4) We provide comprehensive experimental validation through simulations under both ideal and noisy conditions, demonstrating that the method maintains sub-pixel accuracy while achieving a computational speedup exceeding 24 times compared to the time-domain algorithm, confirming its accuracy, efficiency, and practical robustness.
The remainder of this paper is organized as follows. Section 2 details the optical/SAR airborne co-aperture system and the associated imaging models. Section 3 presents the proposed wavenumber domain consistent imaging compensation method. Experimental results and analysis are provided in Section 4, followed by a discussion in Section 5. Finally, Section 6 concludes the paper.
3. Wavenumber Domain Consistent Imaging Compensation Method Based on High-Order Fourier Series Fitting
The wavenumber domain (ωk) imaging algorithm is selected as the cornerstone of our consistent imaging framework because of its superior computational efficiency over the back-projection (BP) algorithm [26], a decisive factor for enabling real-time processing. However, this efficiency comes with an inherent trade-off: both theoretical analysis and simulation results demonstrate that a core operation within the process, the Stolt interpolation, introduces systematic residual geometric deviations. This inherent flaw fundamentally limits the co-registration accuracy between SAR and optical data. This section therefore presents a compensation method embedded directly into the ωk process, which corrects the errors at their source while preserving the algorithm's computational advantage.
We first outline the key steps of the ωk algorithm, which serves as the foundation for our subsequent compensation framework. The echo signal is transformed into the 2D frequency domain and expressed as

S(f_τ, f_η) = A · W_r(f_τ) · W_a(f_η − f_ηc) · exp{jθ(f_τ, f_η)},

where A is the complex constant in the 2D frequency domain, f_τ is the range frequency, f_η is the azimuth frequency, W_r(f_τ) is the envelope of the range spectrum, W_a(f_η − f_ηc) is the envelope of the azimuth spectrum centered around the Doppler central frequency f_ηc, and θ(f_τ, f_η) is the phase term in the 2D frequency domain.
The key steps of ωk imaging are reference function multiplication (RFM) and Stolt interpolation. The purpose of RFM is to eliminate the phase at the reference distance to achieve full focusing there, which is generally set to the nearest slant range R_ref. The phase of the reference signal is

θ_ref(f_τ, f_η) = (4πR_ref/c) · √((f_0 + f_τ)² − c²f_η²/(4v²)),

where f_0 is the carrier frequency and v is the platform velocity.
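As a brief sketch of the RFM step under the standard wavenumber-domain formulation (all numeric parameter values below are illustrative assumptions, not system parameters from this work):

```python
import numpy as np

c = 3e8          # speed of light (m/s)
f0 = 9.6e9       # assumed X-band carrier frequency (Hz)
R_ref = 5000.0   # assumed reference (nearest) slant range (m)
v = 100.0        # assumed platform velocity (m/s)

# 2D frequency grids: range frequency f_tau, azimuth frequency f_eta
f_tau = np.linspace(-75e6, 75e6, 512)      # range frequency axis (Hz)
f_eta = np.linspace(-500.0, 500.0, 256)    # azimuth frequency axis (Hz)
F_tau, F_eta = np.meshgrid(f_tau, f_eta)

# Reference phase at R_ref (standard wavenumber-domain form)
root = np.sqrt((f0 + F_tau) ** 2 - (c * F_eta) ** 2 / (4.0 * v ** 2))
theta_ref = 4.0 * np.pi * R_ref / c * root

def apply_rfm(spectrum_2d, theta_ref):
    """Multiply the 2D echo spectrum by the conjugate reference phase."""
    return spectrum_2d * np.exp(-1j * theta_ref)
```

Multiplying by the conjugate reference phase removes the phase history of a target at R_ref, leaving only the residual terms discussed next.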
Following RFM and range compression, Equation (10) presents the expansion of the nonlinear residual phase function to the quadratic term, dividing it into three parts. The first term is the nonlinear residual azimuth modulation, which is approximated as a quadratic function of the azimuth frequency. The second term is the residual distance drift, which is linear in the range frequency, with a coefficient that increases approximately linearly with the square of the azimuth frequency. The third term, the range-azimuth coupling term, can be ignored in the pure side-looking SAR imaging geometry [27]. In Equation (10), D denotes the migration parameter.
The pivotal step in the ωk algorithm is the Stolt interpolation, defined in Equation (11). Its role is to remap the range frequency from a nonlinear to a linear scale, completing the SAR focusing process by compensating for the residual range and azimuth compression terms. This critical transformation forms the primary bottleneck in balancing accuracy and efficiency. The interpolation itself is an inherent source of phase error: the choice of interpolation kernel and the effects of non-uniform sampling directly dictate the final image quality, introducing geometric deviations, sidelobe artifacts, and focus degradation. The geometric inaccuracy is particularly critical, as it is the chief contributor to the consistent imaging bias that violates sub-pixel co-registration requirements. The resulting phase term after Stolt interpolation is given by Equation (12).
In practice, the Stolt interpolation relies on a sinc kernel, which inherently introduces the Gibbs phenomenon. This manifests as periodic time-domain oscillations that resist complete suppression by windowing, with unsampled areas remaining particularly vulnerable. Consequently, the interpolation process imprints a distinct, oscillatory signature as residual high-order phase error, while the windowing operation itself further degrades interpolation accuracy. Additionally, geometric distortion also arises from the evolving imaging geometry, as the radar’s viewing angle changes continuously throughout the synthetic aperture.
Collectively, these inherent limitations explain the algorithm’s inferior geolocation accuracy. The deterministic, oscillatory nature of the residual high-order phase errors, as revealed by this analysis, therefore establishes them as the definitive target for compensation. The practical impact of these inaccuracies becomes critically evident during optical/SAR cross-modal registration. The residual deviations from Stolt interpolation exceed the sub-pixel registration tolerance of high-resolution optical imagery. When the SAR image is projected into the optical pixel coordinate system, these otherwise minor deviations collectively cause noticeable registration discrepancies.
Building upon this error analysis, we propose a compensation framework designed to correct these deterministic, oscillatory geometric deviations at their source within the imaging chain. The core idea is to model the Stolt-induced azimuth position error as a function of azimuth coordinate, convert this error model into a compensatory phase in the wavenumber domain, and embed this phase factor directly after the Stolt interpolation step. This approach leverages the very oscillatory nature of the error, making a high-order Fourier series its natural mathematical representation.
Driven by the need for real-time performance and the desire to avoid the difficulties of image-domain post-processing, we implement this framework in three concrete steps. First, we establish the quantitative relationship between SAR imaging position errors and optical/SAR pixel deviations based on the unified geometric projection. Second, we quantify the allowable bounds of SAR geometric deviations corresponding to optical/SAR sub-pixel co-registration requirements, and acquire specific azimuth position error samples by simulating ideal targets, measuring their pixel deviations after uncompensated imaging, and converting them via the established relationship. Finally, we fit the sampled errors with a high-order Fourier series to derive the compensation coefficients, which are then embedded into the imaging chain.
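The final fitting step can be sketched as an ordinary linear least-squares problem, since the Fourier basis is linear in its coefficients. This is a minimal illustration under assumed function names; the paper's actual fitting procedure and series parameterization may differ in detail:

```python
import numpy as np

def fit_fourier_series(x, err, order, period):
    """Least-squares fit of sampled position errors err(x) with
    a0 + sum_n [a_n cos(n w x) + b_n sin(n w x)], w = 2*pi/period."""
    w = 2.0 * np.pi / period
    cols = [np.ones_like(x)]
    for n in range(1, order + 1):
        cols.append(np.cos(n * w * x))
        cols.append(np.sin(n * w * x))
    A = np.column_stack(cols)                       # design matrix
    coeffs, *_ = np.linalg.lstsq(A, err, rcond=None)
    return coeffs, w

def eval_fourier_series(x, coeffs, w):
    """Evaluate the fitted series at azimuth positions x."""
    y = np.full_like(x, coeffs[0], dtype=float)
    for n in range(1, (len(coeffs) - 1) // 2 + 1):
        y += (coeffs[2 * n - 1] * np.cos(n * w * x)
              + coeffs[2 * n] * np.sin(n * w * x))
    return y
```

Because the model is linear in the coefficients, the fit is a single matrix solve, which keeps the calibration step cheap enough for the real-time workflow described above.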
The foundation of this method is to project the SAR imaging results directly into the optical pixel coordinate system, based on the object-point to image-point transformation defined in Equation (1). To facilitate the derivation and the simulation analysis, we introduce a set of ideal geometric assumptions: after the aircraft achieves stable flight, all sensor NED coordinate systems are parallel at all capture moments. As illustrated in Figure 2, the origin of the NED coordinate system (point O) is defined as the ground point at the same distance as the starting moment of the flight. The translation vector between this point and the origin of the camera coordinate system at a given shooting moment is parameterized by the flight-direction distance. Under these ideal conditions, the transformation from the NED to the camera coordinate system consists of a rotation around the Z-axis, followed by a rotation around the X-axis.
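The composed NED-to-camera transformation can be sketched as follows; the rotation angles and offset vector are left as hypothetical inputs, since their specific values follow from the geometry in Figure 2 and are not reproduced here:

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the Z-axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def rot_x(angle):
    """Rotation matrix about the X-axis (angle in radians)."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def ned_to_camera(p_ned, t_vec, yaw, tilt):
    """Transform a point from NED to camera coordinates: remove the
    (hypothetical) platform translation t_vec, then rotate about Z
    by yaw and about X by tilt, matching the stated order."""
    R = rot_x(tilt) @ rot_z(yaw)
    return R @ (np.asarray(p_ned, dtype=float) - np.asarray(t_vec, dtype=float))
```

With both angles set to zero the transform reduces to a pure translation, a convenient sanity check on the convention.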
By substituting Equations (1)–(3), we can approximate the image-point positions in the pixel coordinate system for both an ideal target point and a target point subject to SAR imaging position deviations. The resulting pixel deviations between the two are given in Equations (15) and (16). This derivation yields the mathematical model that forms the basis of our real-time compensation framework.
From Equations (15) and (16), it is evident that the pixel deviations in range and azimuth are proportional to the SAR imaging position deviations in their respective directions. In the range direction, the scaling coefficient in Equation (16) provides a relatively large tolerance margin: multiplying the range imaging position deviation by this coefficient yields only a sub-pixel-level deviation. Quantitative analysis based on Equation (16) and Table 1 demonstrates that for a range resolution of 0.33 m or finer, the resulting deviation remains within the sub-pixel tolerance. The azimuth direction, however, presents a much more challenging scenario. Quantitative analysis via Equation (15) shows that the SAR azimuth position deviations allowed by optical/SAR sub-pixel azimuth registration fall within a very narrow interval, a bound that is extremely difficult to meet through direct azimuth imaging for most SAR results.
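The tolerance reasoning reduces to simple arithmetic: since pixel deviation = k × position deviation, the largest admissible position deviation for sub-pixel registration is 0.5/k. The coefficients below are purely hypothetical stand-ins (the actual values follow from Equations (15) and (16) and Table 1 and are not reproduced here); they only illustrate why a large azimuth coefficient yields a tight bound:

```python
def allowable_position_deviation(k, max_pixel_dev=0.5):
    """Largest SAR position deviation (m) that keeps the projected
    optical-pixel deviation below max_pixel_dev pixels, given the
    scaling coefficient k (pixels per meter)."""
    return max_pixel_dev / k

k_range = 0.4    # hypothetical range scaling coefficient (px/m): loose bound
k_azimuth = 8.0  # hypothetical azimuth scaling coefficient (px/m): tight bound

print(allowable_position_deviation(k_range))    # larger tolerance in range
print(allowable_position_deviation(k_azimuth))  # much tighter tolerance in azimuth
```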
This quantitative finding precisely defines the precision gap: the inherent azimuth deviations of the ωk algorithm systematically exceed the narrow tolerance dictated by the optical imagery. To bridge this gap, a consistent imaging deviation compensation framework is essential. The core of our approach is to establish and exploit the joint relationship between optical/SAR pixel deviations and SAR imaging position deviations to solve for the specific azimuth imaging position deviation requiring compensation. This key compensation term is modeled in Equation (17) as a high-order Fourier series in the azimuth position, since this mathematical form inherently captures the periodic, oscillatory errors originating from the Stolt interpolation.
Δx(x) = a_0 + Σ_{n=1}^{N} [a_n cos(nωx) + b_n sin(nωx)],  (17)

where N is the order of the Fourier series, a_0, a_n, b_n, and ω are the coefficients to be determined, and x is the azimuth position within the SAR imaging interval.
The corresponding phase factor for the azimuth imaging position deviation to be compensated in the frequency domain is expressed as follows:
By the same rationale, the optical/SAR azimuth pixel deviation must also be modeled as a high-order Fourier series, given the proportional relationship in Equation (15) between the azimuth pixel deviation and the azimuth imaging position deviation, and the nonlinear nature of the azimuth residual phase. The azimuth-consistent imaging compensation factor under the co-aperture system is then derived and takes the same form as the azimuth deviation phase factor. The scaling coefficient between the pixel deviations and the imaging position deviations can be regarded as a constant within the optical/SAR overlapping FOV.
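The mechanism by which a position deviation becomes a phase factor is the Fourier shift theorem: a fitted azimuth displacement maps to a linear phase ramp in azimuth frequency. The sketch below illustrates this principle only; the paper's actual compensation factor, derived in this section, need not take exactly this simple form, and all numeric values are assumptions:

```python
import numpy as np

def azimuth_shift_factor(f_eta, delta_x, v):
    """Phase factor that shifts an azimuth signal by delta_x (m):
    a time shift delta_x / v maps to a linear phase ramp over the
    azimuth frequency axis f_eta (Fourier shift theorem)."""
    return np.exp(-2j * np.pi * f_eta * delta_x / v)

# Sanity check: shift a discrete azimuth line by exactly one sample.
n = 128
v, prf = 100.0, 512.0          # hypothetical velocity (m/s) and PRF (Hz)
dx = v / prf                   # one azimuth sample spacing (m)
sig = np.zeros(n); sig[10] = 1.0
f_eta = np.fft.fftfreq(n, d=1.0 / prf)
shifted = np.fft.ifft(np.fft.fft(sig) * azimuth_shift_factor(f_eta, dx, v))
# the impulse moves from index 10 to index 11
```

Because the factor is applied as a pointwise multiplication in the wavenumber domain, embedding it after the Stolt interpolation adds negligible cost to the imaging chain.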
The complete flowchart of the proposed consistent imaging method is presented in Figure 4, illustrating the overall workflow from raw data to co-registered output. The framework achieves source-level compensation by embedding the derived phase factor directly into the wavenumber domain processing chain, with the "Embedded Compensation" step serving as the real-time registration operator. It thereby not only closes the systematic precision gap but also establishes a foundation for pixel-level co-registration at the time of SAR image formation.
5. Discussion
This work establishes a new paradigm for optical/SAR consistent imaging by performing geometric registration directly within the wavenumber domain imaging process itself, which overcomes the traditional reliance on complex and inefficient post-processing alignment. The experimental results robustly confirm the feasibility and computational superiority of the proposed wavenumber domain method, which completed the entire imaging and registration process in merely 72.08 s, over 24 times faster than the time-domain BP algorithm (1787.68 s).
Practical Robustness and Extendable Framework. The current validation demonstrates the core paradigm under an ideal co-aperture geometry with flat terrain. Regarding operational robustness, it is crucial to recognize that the proposed method is fundamentally an extendable parametric compensation framework, not a fixed solution. The high-order Fourier series models systematic geometric biases. In practice, deviations such as residual pointing errors or low-frequency platform motion can be incorporated by constructing an updated geometric model, deriving the range and azimuth bias functions, and embedding them into the compensation series. This approach, conceptually aligned with bias compensation in interferometric SAR, allows the method to handle a class of deterministic, modelable errors, under which sub-pixel registration can be maintained. The framework is also compatible with standard motion compensation preprocessing for handling high-frequency phase errors. The compensation step itself adds negligible computational overhead, preserving the real-time capability intrinsic to the ωk algorithm. Thus, the method provides a foundation for robust, calibrated operation beyond the ideal case.
Limitations and Pathways to Operational Robustness. While the proposed method achieves high accuracy under the controlled conditions of this study, its transition to operational use requires addressing inherent limitations and environmental complexities. First, the few residual azimuth outliers observed even in flat-terrain simulations stem from the fundamental trade-off in parametric fitting: a finite-order Fourier series minimizes the overall error across the aperture but does not guarantee zero error at every discrete point. Local signal properties, such as sidelobe interactions, can also contribute to these minor deviations. For real-world deployment involving non-flat terrain, noise, and potential calibration residuals, the core signal-domain compensation should be integrated into a hierarchical robustness framework: (1) Enhanced Geometric Modeling by incorporating a digital elevation model (DEM) to derive terrain-aware compensation coefficients; (2) Detection and Local Refinement through lightweight quality-assurance and targeted local alignment for flagged outliers; (3) System-Level Integration with confidence-aware fusion algorithms in downstream applications. This envisioned progression charts a clear pathway for evolving the method from a laboratory demonstration into a robust system component.
Building upon the extendable framework and the robustness pathways outlined above, future work will center on evolving this paradigm from an ideal-case solution to a robust tool. This includes exploring the adaptation of the proposed signal-domain compensation principle to other high-efficiency SAR imaging algorithms, such as the Chirp Scaling algorithm, to broaden the framework’s applicability. We also plan to develop a dynamic error model that incorporates real-time navigation data and high-resolution terrain information. The trial deployment of our system will be instrumental in this endeavor, providing the crucial dataset needed to probe the method’s limitations and advance its capabilities. The inherent flexibility of the high-order Fourier series compensation suggests a strong potential for adaptation, positioning it as a cornerstone for future high-fidelity, consistent imaging in topographically complex and dynamic environments.