A New Combined Adjustment Model for Geolocation Accuracy Improvement of Multiple Sources Optical and SAR Imagery

Abstract: Numerous earth observation data obtained from different platforms have been widely used in various fields, and geometric calibration is a fundamental step for these applications. Traditional calibration methods are developed based on the rational function model (RFM), which is produced by image vendors as a substitute for the rigorous sensor model (RSM). Generally, the fitting accuracy of the RFM is better than 1 pixel, whereas it degrades to several pixels in mountainous areas, especially for Synthetic Aperture Radar (SAR) imagery. Therefore, this paper proposes a new combined adjustment model for geolocation accuracy improvement of multiple sources satellite SAR and optical imagery. Tie points are extracted with a robust image matching algorithm, and relationships between the parameters of the Range-Doppler (RD) model and the RFM are established by transforming both into the same Geodetic Coordinate System. At the same time, a heterogeneous weight strategy is designed for better convergence. Experimental results indicate that our proposed model achieves much higher geolocation accuracy, approximately 2.60 pixels in the X direction and 3.50 pixels in the Y direction. Compared with traditional RFM-based methods, our proposed model provides a new way for the synergistic use of multiple sources remote sensing data.


Introduction
With the development of satellite imaging technology, it is increasingly common to obtain repeated observations of the same object from multiple sources in a short time, providing abundant imagery widely used in many fields, such as 3D reconstruction [1], change detection [2] and semantic classification [3]. Nowadays, the application of multiple sources airborne and spaceborne remote sensing imagery is increasingly popular in archaeology and cultural heritage as a supplement to traditional methods [4], since it provides sufficient texture information. Terrestrial results obtained by laser scanning suffer from high cost and missing data, whereas the combination with photogrammetry provides an affordable and practical approach for the production of 3D models. Compared with spaceborne imagery, airborne remote sensing images are widely applied due to their high resolution, which provides enough detail of buildings. In 2014, Xu et al. proposed a methodology integrating laser scanning and image-based 3D reconstruction techniques for the production of 3D models [5]. Meyer et al. investigated an optimized Unmanned Aerial Vehicle (UAV) system for the reconstruction of large-scale cultural heritage sites [6]. A digital 3D model of Asinou Church in Cyprus was obtained using a consumer-level DJI platform equipped with a GoPro camera, and a 3D printer was used to create a physical model of the church [7]. In 2015, Jeong et al. investigated the performance of images from IKONOS, QuickBird and KOMPSAT-2 [40]. Redundant observations were involved for geolocation accuracy improvement of multiple sources satellite images [41]. In contrast, the integration of optical and SAR images is seldom investigated. Furthermore, most of the previous studies are based on the RFM, which is not suitable for the geometric processing of multiple sources optical and SAR images in most cases.
In this paper, we propose a new and generic combined adjustment model designed for optical and SAR satellite images. When considering aerial remote sensing images, coefficients of the RFM for optical imagery should first be produced by users before the application of our proposed model. By introducing the relationships between coordinates defined in the Geodetic Coordinate System and the Cartesian Coordinate System, parameters of the RD model are transformed into the same system as the RFM. The normal equations for the combined adjustment model are then developed based on an image-space compensation model. A heterogeneous weight strategy is introduced for better convergence. With the help of a popular modified least-squares method, the ill-conditioned problem can be solved efficiently.
The remainder of this paper is organized as follows: the basic principles of the RFM-based combined adjustment model are introduced in Section 2. Our proposed combined adjustment model and the determination of the heterogeneous weight strategy are presented in Section 3. In Section 4, experimental results are shown to verify the efficiency of our proposed method using multiple optical and SAR images covering the Mount Song area. Conclusions and discussions are drawn in Section 5.

Basic Principle of the RFM
Generally, the RSM is composed of various on-orbit information of the satellite platform, which leads to a complicated form. Therefore, the RFM is proposed as a substitute for the RSM. Usually, the relationship between image-space coordinates and object-space coordinates is described by two ratios of polynomials as:

x = Num_S(P, L, H) / Den_S(P, L, H),  y = Num_L(P, L, H) / Den_L(P, L, H)  (1)

where (x, y) is the normalized image-space coordinate; (P, L, H) denotes the normalized latitude, longitude, and height in object space; Num_S, Den_S, Num_L and Den_L are third-order polynomials consisting of 80 coefficients marked as a_i, b_i, c_i and d_i (i = 0, 1, 2, ..., 19). The normalized coordinates can be obtained according to Equation (2):

P = (φ − φ_off)/φ_scale,  L = (λ − λ_off)/λ_scale,  H = (h − h_off)/h_scale,
x = (s − s_off)/s_scale,  y = (l − l_off)/l_scale  (2)
where (φ, λ, h) are the geodetic latitude, longitude and height, normalized with the corresponding offset and scale factors; (s, l) represents the image sample and line number in pixels, with pixel (0, 0) at the top-left corner of the image.
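As a concrete illustration of Equations (1) and (2), the forward RFM projection can be sketched in Python as follows. The helper names are illustrative, and the 20-term ordering shown here follows the common RPC00B convention, which is an assumption: in practice the ordering must match the vendor's RPC file.

```python
import numpy as np

def normalize(value, offset, scale):
    """Normalize a geodetic or image coordinate per Equation (2)."""
    return (value - offset) / scale

def rfm_polynomial(coeffs, P, L, H):
    """Evaluate one 20-term third-order RFM polynomial.

    Term ordering follows the common RPC00B convention (assumed here).
    """
    terms = np.array([
        1.0, L, P, H, L*P, L*H, P*H, L**2, P**2, H**2,
        P*L*H, L**3, L*P**2, L*H**2, L**2*P, P**3, P*H**2,
        L**2*H, P**2*H, H**3,
    ])
    return float(np.dot(coeffs, terms))

def rfm_project(a, b, c, d, P, L, H):
    """Map normalized object-space (P, L, H) to normalized image (x, y)
    per Equation (1): two ratios of third-order polynomials."""
    x = rfm_polynomial(a, P, L, H) / rfm_polynomial(b, P, L, H)
    y = rfm_polynomial(c, P, L, H) / rfm_polynomial(d, P, L, H)
    return x, y
```

The sketch evaluates the two polynomial ratios directly; denormalizing the result back to pixel coordinates inverts Equation (2) with the sample/line offsets and scales.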
The RFM can fit the RSM well in most cases. However, geolocation error still exists due to the low measurement accuracy of on-orbit sensors. A commonly used model for systematic error compensation is the affine transformation model:

r = a_0 + a_1·x + a_2·y,  c = b_0 + b_1·x + b_2·y  (3)

In Equation (3), r and c are the extracted image-space coordinates; a_0, a_1, a_2, b_0, b_1, and b_2 are the affine transformation coefficients. Usually, systematic errors can be largely eliminated with a translation model using only a_0 and b_0. Hence, the error equations based on the RFM can be simplified and described as follows [42]:

V = A·X_A + B·X_B − l,  with weight matrix P  (4)

where V is the residual vector; A and B are the designed coefficient matrices containing partial derivatives with respect to the unknowns; X_A and X_B are the correction vectors of the affine transformation parameters and object-space coordinates, respectively; l is the vector of residual errors and P denotes the designed weight matrix. The normal equations can be established from Equation (4), according to the principle of least-squares adjustment:

[A^T·P·A  A^T·P·B; B^T·P·A  B^T·P·B] · [X_A; X_B] = [A^T·P·l; B^T·P·l]  (5)
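For the affine compensation of Equation (3), the six coefficients can be estimated by least squares from corresponding measured and RFM-projected image coordinates. The following sketch (function name and array layout are illustrative assumptions) recovers them with NumPy:

```python
import numpy as np

def fit_affine_bias(measured_rc, projected_rc):
    """Estimate the affine compensation of Equation (3) by least squares.

    measured_rc  : (n, 2) array of extracted image coordinates (r, c)
    projected_rc : (n, 2) array of coordinates projected through the RFM
    Returns the coefficient triples (a0, a1, a2) and (b0, b1, b2).
    """
    x, y = projected_rc[:, 0], projected_rc[:, 1]
    # Design matrix of the affine model: [1, x, y] per observation
    A = np.column_stack([np.ones_like(x), x, y])
    coeff_r, *_ = np.linalg.lstsq(A, measured_rc[:, 0], rcond=None)
    coeff_c, *_ = np.linalg.lstsq(A, measured_rc[:, 1], rcond=None)
    return coeff_r, coeff_c
```

Restricting the design matrix to its first column gives the translation-only model with a_0 and b_0 mentioned above.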

Overview
Generally, optical remote sensing imagery can be processed based on the RFM due to its simplicity and standardization. In contrast, the geometric processing of SAR imagery is usually conducted based on the RD model due to the lack of RPCs. In this paper, a new combined adjustment model that aggregates the RD model and the RFM is proposed for the geometric calibration of multiple sources optical and SAR imagery. The workflow is shown in Figure 1. Firstly, the sensor orientation of SAR images is processed for systematic error compensation, which can be considered the coarse-calibration stage. Secondly, conjugate tie points are extracted using the feature-based OS-SIFT method, which is more robust and efficient for remote sensing image matching than classical methods from computer vision [43]. Subsequently, the proposed combined adjustment model consisting of the RD model and the RFM is applied to fulfill the fine-calibration stage. Therefore, the geolocation accuracy of calibrated SAR and optical images can be improved significantly, which facilitates further photogrammetric applications.

Sensor Orientation of SAR Imagery
The RD model is usually considered the rigorous sensor model for SAR images and is composed of three equations: the Range equation, the Doppler equation and the Earth model equation:

R = |R_S − R_T|
f_Dc = −2·(V_S − V_T)·(R_S − R_T) / (λ_c·R)
(X² + Y²)/(R_e + h)² + Z²/R_p² = 1  (6)

where R represents the measured range between the target and the sensor; R_T and R_S are the position vectors of the target and the SAR sensor, respectively; V_T and V_S are the corresponding velocity vectors of the target and the SAR sensor; f_Dc denotes the Doppler centroid frequency; λ_c is the radar wavelength; (X, Y, Z) denotes the target position; h is the target height relative to the surface of the earth; R_e and R_p are the equatorial radius and the polar radius of the Earth [32]. According to Equation (6), the geolocation accuracy of SAR images is mainly influenced by the slant range measurement error, the azimuthal time error, the Doppler center frequency error, the ephemeris error of the satellite platform, and the topographic error [30]. Therefore, the calibration of SAR images can be complicated due to these different error sources. The slant range measurement error and the azimuthal time error play the most important role among these factors, leading to geolocation errors in the X and Y directions, respectively. In particular, the slant range measurement error is associated with different combinations of bandwidth and pulse width of the SAR sensor, and can be calibrated with a simple static model by correcting the internal calibration time delay error and the atmospheric time delay error [39].
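The three conditions of Equation (6) can be evaluated numerically as in the sketch below; the function names, the illustrative C-band wavelength, and the use of WGS84-like radii are assumptions, not values from the paper.

```python
import numpy as np

WAVELENGTH = 0.055  # illustrative C-band radar wavelength in metres

def range_equation(sensor_pos, target_pos):
    """Range equation: slant range R = |R_S - R_T|."""
    return np.linalg.norm(sensor_pos - target_pos)

def doppler_equation(sensor_pos, sensor_vel, target_pos, target_vel):
    """Doppler equation: f_Dc = -2 (V_S - V_T)·(R_S - R_T) / (lambda R)."""
    los = sensor_pos - target_pos
    R = np.linalg.norm(los)
    return -2.0 * np.dot(sensor_vel - target_vel, los) / (WAVELENGTH * R)

def earth_model(target_pos, h, Re=6378137.0, Rp=6356752.3):
    """Earth model residual: (X^2+Y^2)/(Re+h)^2 + Z^2/Rp^2 - 1,
    zero when the target lies on the height-h ellipsoid."""
    X, Y, Z = target_pos
    return (X**2 + Y**2) / (Re + h)**2 + Z**2 / Rp**2 - 1.0
```

Geolocation with the RD model amounts to finding the (X, Y, Z) that drives all three residuals to zero for a measured range and Doppler centroid.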
After the calibration of the slant range measurement error, the geolocation accuracy of SAR images can be improved by several meters. The coarse-calibration is achieved by updating the sensor parameters of the RD model. Normally, the geolocation accuracy of coarse-calibrated SAR imagery cannot meet the requirement for further photogrammetric applications. Traditional methods for geolocation accuracy improvement of SAR images usually depend on additional reference data, such as GCPs and LiDAR data, whose collection requires considerable financial and human resources. Hence, a GCP-free combined adjustment model is proposed for the production of remote sensing data with very high geolocation accuracy.

Unification of Coordinate System
Previous studies have revealed different types of combined adjustment models for multiple sources remote sensing data. Most of them are developed based on the RFM, and almost all of them highly depend on additional existing reference data, as mentioned above. In contrast, we propose a new combined adjustment model designed for the geometric calibration of SAR and optical images.
For optical images, interior and exterior orientation parameters are necessary for the establishment of the Collinearity Condition Equations [44]. Considering the complexity and inconsistency between different platforms, coefficients of the RFM are provided by image vendors as a substitute. Usually, the production of RPCs is conducted under the Geodetic Coordinate System, whereas the RD model is defined in the Cartesian Coordinate System. Therefore, parameters of the RD model are transformed into the Geodetic Coordinate System according to Equation (7).
X = (N + h)·cos φ·cos λ
Y = (N + h)·cos φ·sin λ
Z = [N·(1 − e²) + h]·sin φ  (7)

where (X, Y, Z) denote the space rectangular coordinates; φ and λ are the geodetic latitude and longitude in radians; N represents the radius of curvature in the prime vertical, N = a/√(1 − e²·sin²φ), where a is the semi-major axis and e denotes the first eccentricity of the reference ellipsoid.
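The transformation of Equation (7) can be sketched directly in Python; the WGS84 ellipsoid parameters used here are an assumption, since the paper does not state which reference ellipsoid is used.

```python
import math

# WGS84 parameters (assumed; the paper does not name the ellipsoid)
A_WGS84 = 6378137.0
E2_WGS84 = 6.69437999014e-3  # first eccentricity squared

def geodetic_to_ecef(lat, lon, h):
    """Transform geodetic (phi, lambda, h), in radians and metres,
    to space rectangular coordinates (X, Y, Z) per Equation (7)."""
    # Radius of curvature in the prime vertical
    N = A_WGS84 / math.sqrt(1.0 - E2_WGS84 * math.sin(lat)**2)
    X = (N + h) * math.cos(lat) * math.cos(lon)
    Y = (N + h) * math.cos(lat) * math.sin(lon)
    Z = (N * (1.0 - E2_WGS84) + h) * math.sin(lat)
    return X, Y, Z
```

The inverse transformation (ECEF to geodetic) is iterative in the general case, which is why the paper transforms the RD model parameters once into the Geodetic Coordinate System rather than converting back and forth.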
Given the relationship between the geodetic coordinates and the space rectangular coordinates, parameters of the RD model can be translated into the Geodetic Coordinate System. However, the RD model is developed based on the Range equation and the Doppler equation, which indicates that the geolocation error should be reprojected into image space to keep it in line with the RFM.

Combined Normal Equations
As demonstrated in Equation (3), an affine transformation model can compensate for the geolocation error of the RFM efficiently. The situation is much more complicated for the RD model, because no formula directly relates the object-space coordinates to the image-space coordinates. As demonstrated above, the slant range measurement error and the azimuthal time delay error are the main factors that influence the reprojection error of the image-space coordinates. For simplicity, the RD model can be rewritten in the implicit form:

G_r(x, y, P, L, H) = 0,  G_c(x, y, P, L, H) = 0  (8)

where G_r and G_c are influenced by the image-space coordinates. To simplify the normal equations, the traditional "stop and go" assumption is utilized here [45]. Hence, the parameters of the RD model can be represented by the image-space coordinates as in Equations (9)-(11); in particular, the slant range and azimuth time follow

R = R_near + (x + dx)·c_s/(2·f_s),  t = t_0 + (y + dy)/prf

where (x, y) represent the image-space coordinates; (dx, dy) are the corresponding corrections; R_near is the measured range corresponding to the first pixel in the image sample direction; t_0 denotes the start imaging time in the image line direction; c_s represents the speed of light; f_s is the sampling frequency in the slant range direction; R_ref denotes the reference slant range distance; dt_0 is the reference of the Doppler time; d_i are the coefficients of the Doppler center frequency; W and w_s are the image width and the sample spacing; dt_1 denotes the azimuth time; prf is the pulse repetition frequency; and p_i and q_i are the coefficients fitting the position and velocity vectors of the satellite platform.
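The linear mapping from corrected image coordinates to slant range and azimuth time can be sketched as follows; the function name is illustrative, and the two relations are the standard sample-to-range and line-to-time conversions implied by the parameter list above.

```python
def image_to_rd_parameters(x, y, dx, dy, R_near, t0, f_s, prf,
                           c_s=299792458.0):
    """Map corrected image coordinates to slant range and azimuth time.

    Sample direction: R = R_near + (x + dx) * c_s / (2 * f_s)
    Line direction:   t = t0 + (y + dy) / prf
    """
    R = R_near + (x + dx) * c_s / (2.0 * f_s)
    t = t0 + (y + dy) / prf
    return R, t
```

Under the "stop and go" assumption, each image line is treated as acquired at a single azimuth time t, so the corrections (dx, dy) translate directly into range and time corrections in the adjustment.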
Hence, normal equations assembling the RD model and the RFM can be developed. Assuming the numbers of optical and SAR images are m and n, the coefficient matrix of the normal equations can be obtained in the form of Equations (12) and (13).
Partial derivatives for optical images based on the RFM can be obtained easily, whereas the formulas corresponding to G_r and G_c are more complicated. With the help of Equation (11), the partial derivatives ∂G_r/∂x, ∂G_r/∂y, ∂G_c/∂x and ∂G_c/∂y can be derived as Equation (14). The partial derivatives of G_r and G_c with respect to P, L and H can be represented as Equation (15).
Given Equation (8), partial derivatives of X, Y and Z with respect to φ, λ and H are easily derived. Hence, the normal equations can be established in matrix form, where the subscripts O and S represent the matrices designed for optical and SAR images, respectively. The establishment of the combined normal equations provides a generic way for the geometric processing of SAR and optical imagery. However, the absolute geolocation accuracy after free block adjustment cannot meet our requirements without the help of GCPs. Therefore, a heterogeneous weight strategy, defined by P_O and P_S, is introduced for better convergence.

Heterogeneous Weight Strategy
Traditional block adjustment methods are developed with an identity weight matrix, which indicates that the contribution of every involved element is the same. However, the geometric performance of multiple sources remote sensing data varies greatly across platforms. Generally, the geolocation accuracy of most world-class SAR imagery is better than 10 m, whereas that of different optical imagery ranges from several meters to hundreds of meters. Therefore, a heterogeneous weight strategy is proposed to ensure that images with higher accuracy contribute more during the combined adjustment process, so that an optimal result can be obtained without using GCPs.
Different from the traditional identity weight matrix, the heterogeneous weight matrices P_O and P_S are defined as follows: where the subscripts O and S indicate the matrices designed for optical and SAR imagery, respectively; m denotes an adaptive parameter that keeps the balance between different weights; q is the resolution of each involved image; H is the orbit height of the optical satellite; θ and ω are the measured rolling and pitching angles of the optical imaging sensor, while for SAR imagery θ represents the looking angle of the SAR sensor; l denotes the number of images in each group, divided according to different principles; l_sum is the total number of involved optical/SAR images; C and A represent the relative geolocation error of each image computed during each iteration. Without the help of GCPs, the above strategy provides generic guidance for the determination of an optimal weight for each observation. In practice, the determination of m_O and m_S is conducted based on more than one test. A converged solution is obtained with the aid of a popular modified least-squares method, and further photogrammetric applications can be investigated after the fine-calibration of the multiple sources dataset.
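The core idea of the strategy, down-weighting images with larger relative geolocation error at each iteration, can be sketched in a highly simplified form below. This is an illustrative assumption only: the paper's actual weight functions for P_O and P_S additionally involve resolution, viewing geometry and image grouping, which are omitted here.

```python
def heterogeneous_weight(m, relative_error, eps=1e-6):
    """Simplified illustrative weight: observations from images with a
    larger relative geolocation error (recomputed each iteration) get a
    smaller weight, scaled by the adaptive balance parameter m."""
    return m / (relative_error + eps)
```

Because the relative errors are recomputed during each iteration, the weights adapt as the adjustment converges, which is what lets accurate images dominate the solution in the absence of GCPs.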

Experimental Dataset
Considering the revisit period of different commercial satellites with very high resolution, the collection of a multiple sources and multiple observation dataset is time-consuming and expensive. In comparison, the cost of some open access datasets, such as the Sentinel-1 data, is much lower. Hence, multiple remote sensing images obtained from the Jilin-1 (JL-1) optical small satellite constellation and the Gaofen-3 (GF-3) SAR satellite are used to verify the efficiency of our proposed method. As the first Chinese commercial optical satellite constellation, JL-1 was composed of 14 small satellites by the end of 2020. The resolution of JL-1 optical images is 0.92 m and the swath width is 11 km. Benefiting from the non-fixed camera on the platform, images can be obtained at different imaging times and looking angles, which provides a multiple observation dataset with more information. The GF-3 satellite is the first Chinese civilian microwave remote sensing imaging satellite. The nominal resolution of GF-3 images varies from 1 m to 500 m, with the swath width ranging from 10 km to 650 km. In this experiment, 7 JL-1 optical images obtained from 4 different platforms and 3 GF-3 SAR images covering a rural area around Mount Song are selected. Detailed information is listed in Table 1, and the geometric distribution can be found in Figure 2. Influenced by the imaging modality, targets in SAR images are difficult to identify compared with optical images. Therefore, 5 check point sets are extracted from an existing database of control points. All check points are located at the corners of border areas or at road intersections. Moreover, 147 tie points are extracted automatically based on an efficient multiple sources image matching method [43]. Figure 3 shows the geographical distribution of the involved optical and SAR images, as well as the distribution of the extracted tie points.

Performance of the Combined Adjustment
Before the adjustment process, the slant range measurement error is first calibrated based on our previous statistical results [39]. Table 2 gives the geolocation results before and after the coarse-calibration. The initial geolocation accuracy of the GF-3 SAR images is approximately 11 pixels in the X direction and 8 pixels in the Y direction, which is in accordance with previous studies [25]. After calibration, the geolocation error in the X direction influenced by the slant range measurement error is largely eliminated, whereas the variation of the geolocation error in the Y direction is negligible. Based on the coarse-calibration step, the geolocation accuracy of the GF-3 SAR images is improved significantly, which guarantees a better result for the whole combined adjustment process. To verify the efficiency of our proposed combined adjustment model, a traditional RFM-based adjustment process is conducted for comparison. Hence, the RPCs of the SAR images need to be produced in advance. Generally, a terrain-dependent method relying on well-distributed GCPs performs best [46]. In contrast, the terrain-independent method is commonly applied with the help of an open-source DEM. Based on the established spatial grid, the fitting accuracy of the produced RPCs can reach sub-pixel level [47]. Hence, RPCs of the GF-3 SAR images are produced after coarse-calibration, and the RFM-based combined adjustment model including all SAR and optical images can be developed according to Equation (4). Table 3 gives the root mean square error (RMSE) of the whole dataset processed with different models. With an identity weight matrix, the RFM-based combined adjustment model improves the geolocation accuracy of the whole dataset from about 146.36 pixels in the X direction and 111.75 pixels in the Y direction to 61.97 pixels in the X direction and 72.23 pixels in the Y direction.
In contrast, the performance of our proposed model is better than that of the traditional one, with an accuracy of 59.43 pixels in the X direction and 71.89 pixels in the Y direction. Figure 4 gives the relative geolocation accuracy between optical and SAR images after the combined adjustment. Without the help of GCPs, the geolocation results after free combined adjustment cannot meet the requirement for further processing, such as the production of geocoded 3D products and target localization. Hence, the heterogeneous weight strategy is applied for better convergence. The final results are listed in Table 3. After processing, the geolocation accuracy improves to approximately 3 pixels in the X direction and 4.5 pixels in the Y direction. At the same time, our proposed model again shows better performance than the traditional RFM-based combined adjustment model. Furthermore, Figure 5 shows the error distribution after processing with these two methods. Compared with the traditional one, our proposed model also gives the best performance in convergence, which can be derived from the consistency between images.
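The per-direction RMSE figures reported above can be reproduced from check-point residuals with a short helper; the function name and array layout are illustrative assumptions.

```python
import numpy as np

def rmse_xy(residuals):
    """RMSE in the X and Y directions over all check points.

    residuals : (n, 2) array of (dx, dy) reprojection errors in pixels.
    Returns an array (rmse_x, rmse_y).
    """
    return np.sqrt(np.mean(residuals**2, axis=0))
```

Evaluating this over the check-point residuals before and after adjustment yields the per-direction accuracies compared in Table 3.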

Discussions and Conclusions
Traditional methods for the geometric processing of multiple sources optical and SAR imagery are developed based on the RFM. Different from most optical satellite imagery, the RPCs of SAR images are not always provided by image vendors, especially for some popular SAR sensors such as the TerraSAR-X and Sentinel-1 satellites. Therefore, the RPCs of SAR imagery have to be produced by users additionally. Moreover, the fitting accuracy is highly dependent on the terrain.
Aiming at finding a generic and simple way for the geometric calibration of multiple sources optical and SAR images, we propose a new combined adjustment model. Unlike traditional RFM-based methods, the slant range measurement error of SAR images obtained from the GF-3 satellite is calibrated based on our previous work. After the coarse-calibration step, tie points are automatically extracted from both optical and SAR images. The combined adjustment model is established by reprojecting the parameters of the RD model into the same coordinate system as the RPCs. Together with the additional heterogeneous weight strategy, our proposed model gives the best performance. Compared with traditional methods, our proposed model provides a new way for the integration of multiple sources optical and SAR data which does not introduce extra fitting error. Furthermore, the proposed model also enables applications such as precise photogrammetric reconstruction.