Image Mosaicking Approach for a Double-Camera System in the GaoFen2 Optical Remote Sensing Satellite Based on the Big Virtual Camera

The linear array push broom imaging mode is widely used for high resolution optical satellites (HROS). Using double-cameras attached by a high-rigidity support along with push broom imaging is one method to enlarge the field of view while ensuring high resolution. High accuracy image mosaicking is the key factor of the geometrical quality of complete stitched satellite imagery. This paper proposes a high accuracy image mosaicking approach based on the big virtual camera (BVC) in the double-camera system on the GaoFen2 optical remote sensing satellite (GF2). A big virtual camera can be built according to the rigorous imaging model of a single camera; then, each single image strip obtained by each TDI-CCD detector can be re-projected to the virtual detector of the big virtual camera coordinate system using forward-projection and backward-projection to obtain the corresponding single virtual image. After an on-orbit calibration and relative orientation, the complete final virtual image can be obtained by stitching the single virtual images together based on their coordinate information on the big virtual detector image plane. The paper subtly uses the concept of the big virtual camera to obtain a stitched image and the corresponding high accuracy rational function model (RFM) for concurrent post processing. Experiments verified that the proposed method can achieve seamless mosaicking while maintaining the geometric accuracy.


Introduction
Linear array push broom is a widely used imaging mode for the high resolution optical satellite (HROS). To increase the resolution, the camera is typically designed with a long focal length to achieve a narrow field angle. To increase the field of view (FOV), it is necessary to assemble additional TDI-CCD detectors together because the existing TDI-CCD detector cannot achieve the large field of view requirement, such as IKONOS, Pleiades-HR, Worldview series, ZiYuan3, etc. [1][2][3]. Another method to enlarge the field of view is to use double-cameras attached with a high-rigidity support concurrently with push broom imaging. Launched in 2014, the GaoFen2 (GF2) remote sensing satellite is a Chinese high resolution optical satellite equipped with a double-camera push broom imaging system resulting in a high resolution of 0.81 m; a 45 km width strip can be obtained, which is four times the size that IKNONS can obtain. However, the image of each individual camera in the double-camera system has an independent optical imaging system and imaging model, which creates additional work in the subsequent processing. Therefore, it is necessary to investigate the corresponding image mosaicking approach for the double-camera system.
Researchers have focused on investigations involving stitching several satellite images together [4,5]; there are two methods that are traditionally used. The first method is to rectify all the images to the same geodetic coordinate system, and then stitch the rectified images by image registration and mosaicking to produce the complete image. This method has high geometric accuracy but high complexity in image rectification and registration [6][7][8]. The second method is to utilize the overlapping regions to match the homonymy points for stitching the image while disregarding the geometric imaging model for each image [9][10][11]. The stitched image that results from this method lacks a rigorous imaging model, therefore, it has a poor geometric accuracy and cannot adequately meet the surveying and mapping requirements. Researchers have proposed an inner FOV stitching method based on the virtual detector line built in the camera coordinate system for multi-TDI-CCD mosaicking and achieved an ideal result, which provided the basic idea for exploring the mosaicking approach using multiple cameras [12,13]. The difference and difficulty for a double-camera system mosaicking compared with the multi-TDI-CCD mosaicking is the unstable relative installation relationship caused by the changing thermal environment, that is to say, even though we calibrated the relative installation relationship by one image scene, for other scenes, the relative installation relationship will probably has a tiny change, which will directly influent the mosaicking accuracy.
In this paper, we propose an image mosaicking approach for double-camera systems on the GF2 satellite based on the big virtual camera (BVC) in the satellite body coordinate system, which is built with no internal distortion and its FOV covers all the camera's FOV. According to the original rigorous imaging model and the updated rational function model (RFM) by relative orientation of each camera, images obtained at the same time by the double-camera system are re-projected to the same big virtual camera coordinate system to produce the entire central projection stitched image by feather processing alone. Moreover, the high accuracy rational function model (RFM) of the stitched image can be obtained concurrently, which will provide the geometric information required for post processing. The proposed approach has been validated for precision and robustness, and it works under varying topographical conditions and achieves the goal of fully automated image data pretreatment on the ground.

Rigorous Imaging Model for the Single Camera
In the double-camera push-broom imaging system on the GF2 satellite, the principal optic axes of the cameras are installed in the same plane as the greatest extent and the angle between them is 2.01 • , as shown in Figure 1a. Therefore, double-cameras share the same orbit and attitude data when taking concurrent images, while each single camera has its own installation angles and internal optical parameters. The detailed information of the single camera is listed in Table 1. (VSD) is used for the five-collinear TDI-CCD detectors to generate the entire stitched image from the single camera, while the big virtual camera (BVC) is used for the double-cameras to generate the entire stitched image from the double-camera imaging system. The goal of image mosaicking for the double-camera system is to seamlessly stitch ten image strips obtained simultaneously by the ten TDI-CCD detectors and to produce a complete wide coverage stitched image based on the virtual single detector and the big virtual camera. The imaging parameters of the rigorous geometric imaging model can be divided into the satellite auxiliary data and the camera parameters. The satellite auxiliary data of time, attitude and orbit help to convert the satellite body coordinates into object coordinates, which determines the ray of light from the projection center to the ground points in the satellite body coordinate system. The interior LOS parameters help to convert the image coordinates into the camera coordinates, which determines the accurate LOS of each detector in the camera coordinate system [14,15]. For the single push-broom camera, the rigorous imaging model of each TDI-CCD detector can be determined as follows:  The focal plane of the single camera is composed of five-collinear TDI-CCD detectors within the field of view and there are many overlapping pixels between the adjacent TDI-CCD detectors for mosaicking. The overlapping regions of the double-camera imaging system can be divided into the camera overlapping region and the detector overlapping region as shown in Figure 1b,c. The camera overlapping region is between the double cameras, and the detector overlapping region is between the two adjacent single TDI-CCD detectors in the same single camera. The virtual single detector (VSD) is used for the five-collinear TDI-CCD detectors to generate the entire stitched image from the single camera, while the big virtual camera (BVC) is used for the double-cameras to generate the entire stitched image from the double-camera imaging system. The goal of image mosaicking for the double-camera system is to seamlessly stitch ten image strips obtained simultaneously by the ten TDI-CCD detectors and to produce a complete wide coverage stitched image based on the virtual single detector and the big virtual camera.
The imaging parameters of the rigorous geometric imaging model can be divided into the satellite auxiliary data and the camera parameters. The satellite auxiliary data of time, attitude and orbit help to convert the satellite body coordinates into object coordinates, which determines the ray of light from the projection center to the ground points in the satellite body coordinate system. The interior LOS parameters help to convert the image coordinates into the camera coordinates, which determines the accurate LOS of each detector in the camera coordinate system [14,15]. For the single push-broom camera, the rigorous imaging model of each TDI-CCD detector can be determined as follows: where ψ x and ψ y are the look angles [16,17] between the line of sight (LOS) and the vertical axis and horizontal axis of the focal plane. S is the sequence number of the single CCD detector, and a 0 , a 1 , a 2 , a 3 and b 0 , b 1 , b 2 , b 3 are the corresponding internal parameters.
is the position vector of the projection center in the WGS84 coordinate system, which is interpolated from ephemeris time observations. R body camera is the installation matrix from the camera coordinate system to the satellite body coordinate system; and R J2000 body is the attitude matrix from the satellite body coordinate system to the J2000 coordinate system, which is interpolated from the attitude time observations under the J2000 coordinate system; and R WGS84 J2000 represents the transformation matrix from the ECF coordinate system to the ECI coordinate system, which is based on the IAU 2000 precession-nutation model from the International Earth Rotation and Reference Systems Service (IERS) conventions.
The interior LOS of the virtual single detector can be easily determined by: where A 0 is the mean value of the internal parameters (a 0 ) i i = 1, 2, 3, 4, 5 of the five TDI-CCD detectors, by which the virtual single detector is placed in the center line of the multi-TDI-CCD along the flight direction. B 0 is equal to b 0 and B 1 is equal to b 1 of the leftmost TDI-CCD1, which eliminates internal distortion for the virtual single detector. Additionally, the original auxiliary data should undergo steady-state processing [18,19] to generate the virtual single images. The steady-state processing is a method to correct the attitude oscillation so as to increase RFM fitting precision for the rigorous imaging model.

Rigorous Imaging Model of the Big Virtual Camera
The single camera auxiliary data can also be applied to the BVC using steady-state processing. The BVC parameters consist of the exterior installation matrix and the interior LOS parameters. Based on the symmetrical installation relationship in the double-camera system shown in Figure 1a, we can assume that the BVC focal length equates to the single camera and the BVC field of view covers the entire imaging system, which is equivalent to placing a big virtual detector on the focal plane, as shown in Figure 1c. The exterior BVC installation matrix is considered the unit matrix, which means the LOS of each detector in the camera coordinate system is equal to the LOS in the satellite body coordinate system. To determine the interior BVC LOS, the number of virtual detectors and the field of view should be determined first. N A and N B represent the number of detectors of the virtual single detectors A and B. According to the design criterion, the number of detectors in the camera overlapping region in the satellite body coordinate system is N Camera . Therefore, the number of BVC detectors is: Based on the above analysis, the LOS x i (s) y i (s) 1 T in the single camera coordinate system of the image point (s, l) i on the single virtual detector can be determined as follows: where i = A, B, A 0i and B 0i , B 1i are the interior LOS coefficients of the single virtual detector. The serial number of virtual detectors A and B can be represented by S A = 0, . . . , N A − 1, and are the installation matrices from the camera A and B coordinate systems to the satellite body coordinate system. As shown in Figure 1c, we can determine the endpoint LOSs for single virtual detectors A and B, A start : Subsequently, the LOS of A start , A end , B start and B end can be projected to the plane Z b to obtain their components in the X b and Y b directions, therefore, projected coordinates in the satellite body coordinate system for the four endpoints (x A0 , y A0 ), (x A1 , y A1 ), (x B0 , y B0 ), and (x B1 , y B1 ) can be calculated as follows: The projected BVC coordinates of the two endpoints (x BVD start , y BVD start ) and x BVD end , y BVD end in the satellite body coordinate system can be calculated as follows: The BVC can be designed with no internal distortion by dividing the interior LOS equally according to the number of virtual detectors, and we can subsequently determine the LOS of the detector S(s = 0, . . . , N V ) in the satellite body coordinate system as follows: Then, we can determine the BVC rigorous imaging model as follows: The advantage of placing the virtual detector in the center line of the double-camera's detectors along the track direction is to reduce the mosaic and virtualization errors. Based on the determined BVC rigorous imaging model, the corresponding RFM can be determined to generate the complete stitched images.

Image Mosaic Workflow Based on VSD and BVC
One straightforward method to produce the complete stitched image is to generate the stitched image from each single camera first, subsequently, stitch the two images to generate the complete stitched image from the double-cameras. However, it can be time-consuming to resample all the original single images twice. The image mosaic method workflow is based on the virtual single detector and the big virtual camera is designed with six primary steps as shown in Figure 2a. original single images twice. The image mosaic method workflow is based on the virtual single detector and the big virtual camera is designed with six primary steps as shown in Figure 2a.
Step 1: Determine the rigorous imaging model (RIM) for original single camera Step 2: Determine the RFM for virtual single detector (VSD) Step 3   The high accuracy camera parameters on the satellite is necessary in the proposed mosaicking Step 1: Determine the rigorous imaging model (RIM) for the original single camera. The RI M a and RI M b for the double-cameras can be constructed based on the collinearity condition equation. The high accuracy camera parameters on the satellite is necessary in the proposed mosaicking approach. The camera parameters that include the exterior installation angles and interior LOS parameters can be determined using an on-orbit high accuracy geometric calibration method [20][21][22]. The projection center and attitude of each imaging line can be interpolated from the original GPS and attitude data.
Step 2: Determine the RFM for the virtual single detector (VSD). The VSD shares the same installation angles with RIM and the interior VSD LOS can be designed with no distortion according to the RIM field of view. Based on the new integration time from averaging the original integration times, the interpolated projection center and the smoothed attitude data can be obtained to determine RFM a and RFM b for the left and right VSDs. The RFM a and RFM b are the steady-state reimaging model of VSD a and VSD b , and an excellent replacement for RI M a and RI M b , which have filtered the observation noise in the auxiliary data through the least square method and, thus, has a more accurate fitting precision [18].
Step 3: Generate the virtual single images (VSI)s of the camera overlap region. Although the consistency of the image positioning accuracy between the adjacent TDI-CCD detectors can be ensured after an on-orbit high accuracy calibration, the consistency of the image positioning accuracy between the adjacent double cameras is hard to maintain due to the inevitable changes of their relative installation relationship caused by the external environmental temperature, which will lead to the dislocation of the camera overlap region. Therefore, to ensure seamless mosaicking solely based on the geometrical information, the relative orientation of the RFM a and RFM b should be determined. The two original single images obtained by the leftmost TDI-CCD detector in the right single camera and the rightmost TDI-CCD detector in the left single camera are projected to their corresponding VSD according to the RFM a and RFM b , and their VSIs are generated using image resampling for the following relative orientation. Forward-projection is based on the rigorous imaging model and backward-projection is based on the RFM in steady-state reimaging processing, as shown in Figure 2b.
Step 4: The relative orientation of the two virtual single images (VSIs). Many homonymy points for the two adjacent VSIs can be matched automatically using the SFIT algorithm, and the image space With the updated RFM a new and RFM b new , the consistency of VSD a and VSD b positioning accuracy can be ensured, which is the key factor for generating the complete stitched image from the double-cameras.
Step 5: Determine the RFM for the big virtual camera (BVC). Based on the rigorous imaging model of the single camera, the rigorous imaging model of the big virtual camera can be determined, and combined with the steady-state processed auxiliary data, the high accuracy RFM BVC of the complete stitched image can be produced, which will be applied to the final complete stitched image.
Step 6: Generate the complete stitched images. To generate the virtual single images in the BVC coordinate system, twice forward and backward intersections are performed.
(1) First, for the image point (s, l) original of the original single image, the detector number s determines the interior orientation elements, and imaging line l determines the imaging time so that the exterior orientation elements can be interpolated by time from the attitude and orbit observation. Thus, the corresponding object space coordinate (B, L, H) of image point (s, l) original will be calculated using the forward projection on elevation surface H using the rigorous imaging model. H can be interpolated from the reference elevation surface (such as SRTM DEM).
(2) The image point (s, l) VSD on the corresponding virtual single detector will be calculated using the backward projection from the object space coordinate (B, L, H) using the VSD RFM.
(3) Then, the updated object space coordinate (B, L, H) can be forward intersected from the image point (s, l) VSD based on the RFM a\b new updated using the relative orientation, and H is interpolated from the reference elevation surface.
(4) The final image point (s, l) BVC on the big virtual detector can be backward intersected from the object space coordinate (B, L, H) based on the RFM BVC .
(5) Thus, point (s, l) BVC on the big virtual detector and point (s, l) original on the original single image can be correlated. The digital number of (s, l) BVC will be determined from (s, l) original using gray resampling.
(6) Repeat the above steps until every pixel for all the single virtual images on the big virtual detector is complete; then, ten single virtual images can be produced.
(7) Finally, the single virtual images produced will be stitched based on the same initial pixel coordinates for the big virtual detector in the big virtual camera, and feather processing without performing the local correction in the overlapping region can produce the seamless mosaic image.

Consistency of the Image Positioning Accuracy
Because the geometric imaging model is important in the proposed mosaicking approach, the consistency of the TDI-CCD detectors' positioning accuracy is the geometric foundation for the mosaic accuracy. As shown in Figure 3, G is the object space point of the image space point P in the overlapping region of the stitched image. The image space points P 1 and P 2 in the adjacent images are determined using backward-projection from G to achieve the conjugate relationship; this means that P 1 and P 2 are homonymy points. In other words, the corresponding object space points G 1 and G 2 of P 1 and P 2 based on their own geometric imaging model should be as close as possible and the distance of the two object space points, which reflects the consistency of the positioning accuracy between the adjacent single images.
Repeat the above steps until every pixel for all the single virtual images on the big virtual detector is complete; then, ten single virtual images can be produced.
(7) Finally, the single virtual images produced will be stitched based on the same initial pixel coordinates for the big virtual detector in the big virtual camera, and feather processing without performing the local correction in the overlapping region can produce the seamless mosaic image.

Consistency of the Image Positioning Accuracy
Because the geometric imaging model is important in the proposed mosaicking approach, the consistency of the TDI-CCD detectors' positioning accuracy is the geometric foundation for the mosaic accuracy. As shown in Figure 3, G is the object space point of the image space point P in the overlapping region of the stitched image. The image space points P1 and P2 in the adjacent images are determined using backward-projection from G to achieve the conjugate relationship; this means that P1 and P2 are homonymy points. In other words, the corresponding object space points G1 and G2 of P1 and P2 based on their own geometric imaging model should be as close as possible and the distance of the two object space points, which reflects the consistency of the positioning accuracy between the adjacent single images.  The positioning accuracy is determined by the accuracy of the satellite auxiliary data and the camera parameters. Using an on-orbit high accuracy geometric calibration [16,17] for the doublecameras, the camera interior LOS parameters can be continuously verified, while small changes will exist in the installation angles due to the changing thermal environment. Therefore, for the detector The positioning accuracy is determined by the accuracy of the satellite auxiliary data and the camera parameters. Using an on-orbit high accuracy geometric calibration [16,17] for the double-cameras, the camera interior LOS parameters can be continuously verified, while small changes will exist in the installation angles due to the changing thermal environment. Therefore, for the detector overlap region of the adjacent collinear TDI-CCD detector, the adjacent TDI-CCD detector scans the same object line at approximately the same time; then, the same attitude and orbit auxiliary data can be interpolated at the same time. Therefore, the consistency of their positioning accuracy can be achieved. For the camera overlap region of the adjacent cameras, the imaging time of the homonymy points is different due to the inevitable installation error, which causes the interpolated auxiliary data of the same scanning line to be slightly different [19]. Additionally, the unstable relative installation relationship will directly cause the consistency of their positioning accuracy not meet the seamless mosaicking requirements. Therefore, the relative orientation should be determined for the double-cameras.

Influence of the Elevation Error
The difference between the interpolated elevation and the actual elevation that is caused by the limit of the absolute positioning accuracy and the elevation data precision is another key factor for the mosaic accuracy.
As shown in Figure 4, in the overlapping region of the adjacent single image, G is the intersection point of the homonymy points P 1 and P 2 for the actual elevation, and ∆H is the elevation error. ∆l represents the object space projection distance of the mosaic error caused by ∆H, as: where the directional angles α 1 and α 2 of the homonymy points P 1 and P 2 are the bias FOV angle of the single CCD detector in the satellite body coordinate system along the track.
Sensors 2017, 17, 1441 9 of 17 overlap region of the adjacent collinear TDI-CCD detector, the adjacent TDI-CCD detector scans the same object line at approximately the same time; then, the same attitude and orbit auxiliary data can be interpolated at the same time. Therefore, the consistency of their positioning accuracy can be achieved. For the camera overlap region of the adjacent cameras, the imaging time of the homonymy points is different due to the inevitable installation error, which causes the interpolated auxiliary data of the same scanning line to be slightly different [19]. Additionally, the unstable relative installation relationship will directly cause the consistency of their positioning accuracy not meet the seamless mosaicking requirements. Therefore, the relative orientation should be determined for the doublecameras.

Influence of the Elevation Error
The difference between the interpolated elevation and the actual elevation that is caused by the limit of the absolute positioning accuracy and the elevation data precision is another key factor for the mosaic accuracy.  As shown in Figure 4, in the overlapping region of the adjacent single image, G is the intersection point of the homonymy points P1 and P2 for the actual elevation, and ΔH is the elevation error. Δl represents the object space projection distance of the mosaic error caused by ΔH, as: where the directional angles α1 and α2 of the homonymy points P1 and P2 are the bias FOV angle of the single CCD detector in the satellite body coordinate system along the track. For the non-overlapping area, Δv represents the virtualization error caused by the elevation error as follows: where α and β represent the bias FOV angles of the image homonymy point Q1 on the original single image and Q2 on the corresponding single virtual image. The larger the intersection angle of the homonymy point's LOS, the greater the influence of the elevation error will be, therefore, to ensure the mosaic and virtualization accuracy, the required interpolated elevation accuracy should be achieved. The elevation error will rarely affect the mosaic accuracy of the detector overlap region due to collinear installation, but has a significant effect on the mosaic accuracy of the camera overlap region. Placing the virtual detector in the center line of the double-camera's TDI-CCD detectors along the track direction can divide the elevation error equally For the non-overlapping area, ∆v represents the virtualization error caused by the elevation error as follows: ∆v = ∆H × |tan α − tan β| (11) where α and β represent the bias FOV angles of the image homonymy point Q 1 on the original single image and Q 2 on the corresponding single virtual image. The larger the intersection angle of the homonymy point's LOS, the greater the influence of the elevation error will be, therefore, to ensure the mosaic and virtualization accuracy, the required interpolated elevation accuracy should be achieved. The elevation error will rarely affect the mosaic accuracy of the detector overlap region due to collinear installation, but has a significant effect on the mosaic accuracy of the camera overlap region. Placing the virtual detector in the center line of the double-camera's TDI-CCD detectors along the track direction can divide the elevation error equally to all the detectors to ensure the mosaic and virtualization accuracy. After the on-orbit high accuracy calibration and relative orientation, the accuracy of the image positioning and the interpolated elevation from the global DEM can be ensured. Based on the analysis above, the goal is to achieve a seamless mosaic image using image feather processing when stitching the single BVC virtual images directly based on their image coordinate for the same big virtual detector coordinate system.

Experimental Data
Four sets of double-panchromatic-images from the GF2 satellite were used in the experiments to verify the reliability of the proposed approach. Table 2 lists the primary information for the images. The four study areas are located in Songshan, Anyang and Dongying, China, and the double-images have the same size. Songshan is a mountainous area with an elevation difference of 1359 m and an average elevation of 431 m, while Anyang and Dongying are flatlands with small elevation differences. Scenes A and B, images of Songshan, were captured in the same orbit on October 27, 2015. The experimental data are the panchromatic images after radiometric-correction, and the same high accuracy calibrated camera parameters and their corresponding auxiliary data. Digital orthophoto maps (DOM) and digital elevation models (DEM) obtained from the WorldView2 satellite was used as reference data for the geometric accuracy evaluation. The DOM resolution is 0.3 m, and the DEM resolution is 2 m. To ensure the appropriate number and distribution of the auto-matched ground control points (GCPs) is used for the geometric accuracy assessment, the selected satellite images have little cloud and water cover. The matching process was performed using the SIFT algorithm, and the matching accuracy was better than 0.3 pixels [23]. Visual and geometric accuracy evaluation of the complete stitched images were performed to comprehensively validate the effectiveness of our approach.

Visual Evaluation
Based on the high accuracy and stable interior calibrated parameters, seamless mosaicking of the detector overlapping region can be achieved directly based on the geometric relationship. Then feather processing is performed during the local processing, therefore, a significant dislocation of the seamline will occur if there is a significant image positioning deviation for the homonymy points on the adjacent single CCD detector, which can be effectively detected by visual evaluation. Due to the unstable small changes in the relative installation relationship of the double-cameras, relative orientation is required for the seamless mosaicking of the camera overlap region. Figure 5 shows the experimental results for Scene A GF2 satellite imagery. Areas 1, 2 and 3 are camera overlapping regions, while area 4 is the detector overlapping region between detector 2 and 3 in camera A, and area 5 is the detector overlapping region between detector 3 and 4 in camera B. The details of areas 1-5 in Figure 5a are shown in Figure 5b-k. Areas 1, 2, and 4 are in mountainous areas, and areas 3 and 5 are in plain areas. Areas 4 and 5 could achieve seamless mosaicking when feather processing is performed for the detector overlapping regions. That is due to the high accuracy and stable on-orbit calibrated interior parameters, which are the most important factors for the consistency of the positioning accuracy in the single camera. We can also see that the stitched areas of the camera overlap regions are also seamless and smooth without a local correction after relative orientation with translation transformation.The stitching result will not be affected by the overlapping region terrain, which is benefited from the geometric characteristics we talk about in the following part.

RFM Fitting Precision
The RFM fitting precision is one key index of the geometry quality for the complete stitched image, and it can be calculated using check points evenly distributed among the virtual control points when fitting the RFM based on the rigorous imaging model as shown in Figure 6 with the results shown in Table 3. Areas 4 and 5 could achieve seamless mosaicking when feather processing is performed for the detector overlapping regions. That is due to the high accuracy and stable on-orbit calibrated interior parameters, which are the most important factors for the consistency of the positioning accuracy in the single camera. We can also see that the stitched areas of the camera overlap regions are also seamless and smooth without a local correction after relative orientation with translation transformation. The stitching result will not be affected by the overlapping region terrain, which is benefited from the geometric characteristics we talk about in the following part.

RFM Fitting Precision
The RFM fitting precision is one key index of the geometry quality for the complete stitched image, and it can be calculated using check points evenly distributed among the virtual control points when fitting the RFM based on the rigorous imaging model as shown in Figure 6 with the results shown in Table 3. The maximum and minimum error absolute values for the rational function coefficients (RPCs) fit using the original attitude reaches approximately 0.65 pixels in the sample and 0.93 pixels in the line directions, respectively. However, the RMSE of the RPC fit using the smooth attitude becomes better than 1.00×10 -4 in the sample and line directions, which may be ignored because it fully achieves the accuracy requirement. This is due to the virtual detector of the big virtual camera being designed without internal distortion and external attitude data in the steady-state processing, which are interpolated using the polynomial model that can filter the observation noise using the least square method (LSM). The high RFM accuracy for the complete stitched image obtained using the big virtual camera provides the necessary geometric information for post processing. Mosaic Accuracy The mosaic accuracy we focus on can be divided into the mosaic accuracy of the detector overlap region and the camera overlap region. Due to the collinear installation of the multi-TDI-CCD detectors, the same scanning line in the detector overlap region is concurrently imaged; then, the  The maximum and minimum error absolute values for the rational function coefficients (RPCs) fit using the original attitude reaches approximately 0.65 pixels in the sample and 0.93 pixels in the line directions, respectively. However, the RMSE of the RPC fit using the smooth attitude becomes better than 1.00 × 10 −4 in the sample and line directions, which may be ignored because it fully achieves the accuracy requirement. This is due to the virtual detector of the big virtual camera being designed without internal distortion and external attitude data in the steady-state processing, which are interpolated using the polynomial model that can filter the observation noise using the least square method (LSM). The high RFM accuracy for the complete stitched image obtained using the big virtual camera provides the necessary geometric information for post processing.

Mosaic Accuracy
The mosaic accuracy we focus on can be divided into the mosaic accuracy of the detector overlap region and the camera overlap region. Due to the collinear installation of the multi-TDI-CCD detectors, the same scanning line in the detector overlap region is concurrently imaged; then, the auxiliary data are concurrently interpolated by time. Based on the high-accuracy camera interior LOS parameters, the mosaic accuracy of the detector overlap region can stably reach approximately zero pixels. Therefore, feather processing is needed for seamless mosaicking. The mosaic accuracy of the camera overlap region is difficult to ensure using an on-orbit calibration due to the inconsistent relative installation relationship of the double-cameras caused by the changing thermal environment. Therefore, the relative orientation should be performed for the double-images captured by the double-cameras. Theoretically speaking, affine transformation should be performed in the relative orientation considering the rigorous installation relationship. However, the small imaging overlap region of the double-cameras and the complicated imaging terrain features cause the homonymy points auto-matching to be difficult and not completely reliable in most cases. Therefore affine transformation will probably cause a more unreliable geometric accuracy for the double images, especially at the edge of the double-images away from the overlapping region. Considering these problems, translation transformation may be an alternative choice if it can also achieve the ideal mosaic accuracy.
Based on the original RFM of virtual single detector, the adjacent single images can be projected to the big virtual detector according to Step 3 in the workflow. Subsequently, the mosaic accuracy can be evaluated using the consistency of the homonymy points in the overlapping regions. More than ten thousand evenly distributed dense homonymy points were matched automatically using the SIFT algorithm. (s,l) left is the image coordinate of the matched point in the left single virtual image, and (s,l) right is the image coordinate of the matched corresponding homonymy point in the adjacent right single virtual image. Because the homonymy points image coordinates of the left and right single virtual image are in the same big virtual detector plane coordinate system, the homonymy points image coordinates should be the same under ideal conditions. The deviation of the homonymy points' image coordinates will directly affect and reflect the mosaic accuracy. The mosaic errors (pixels) relationship with the line number are shown in Figure 7. Significant mosaic errors occur across and along the track direction. The mosaic error fluctuations for the four scenes in the camera overlap region are small, which verifies the consistency of the auxiliary data measurements accuracy in the small imaging time interval of the double-cameras caused by the small displacement installation. The mosaic errors of scenes A and B in the same orbit are approximately the same, and they have significant differences compared to scenes C and D, which were imaged on a different day. Considering the use of high accuracy elevation information, such as GDEM2 for generating the virtual image, the inconsistent mosaic errors are probably caused by the change of the relative installation relationship for the double-cameras. Based on the densely distributed homonymy points in the camera overlap region, translation and affine transformations for the updated RFMs of the single camera are performed using the method in Step 4 of the workflow. The affine model based on RFM was used as the exterior orientation model [24,25]. Based on the updated RPMs, the mosaic accuracy statistics are re-calculated, and the results are shown in Table 4.
x + a 0 + a 1 x + a 2 y = RFM x (lat, lon, h) y + b 0 + b 1 x + b 2 y = RFM y (lat, lon, h) (12) points auto-matching to be difficult and not completely reliable in most cases. Therefore affine transformation will probably cause a more unreliable geometric accuracy for the double images, especially at the edge of the double-images away from the overlapping region. Considering these problems, translation transformation may be an alternative choice if it can also achieve the ideal mosaic accuracy. Based on the original RFM of virtual single detector, the adjacent single images can be projected to the big virtual detector according to Step 3 in the workflow. Subsequently, the mosaic accuracy can be evaluated using the consistency of the homonymy points in the overlapping regions. More than ten thousand evenly distributed dense homonymy points were matched automatically using the SIFT algorithm. (s,l)left is the image coordinate of the matched point in the left single virtual image, and (s,l)right is the image coordinate of the matched corresponding homonymy point in the adjacent right single virtual image. Because the homonymy points image coordinates of the left and right single virtual image are in the same big virtual detector plane coordinate system, the homonymy points image coordinates should be the same under ideal conditions. The deviation of the homonymy points' image coordinates will directly affect and reflect the mosaic accuracy. The mosaic errors (pixels) relationship with the line number are shown in Figure 7. Significant mosaic errors occur  In Table 4, the mosaic error across and along the track directions after translation and affine transformations can be significantly reduced. There is no significant advantage for using the affine transformation compared to the translation transformation, thus, we can speculate that the relative installation change is primarily caused by the changing pitch and roll angles, while the yaw angle change is very small and can be ignored. In addition, a simple calculation shows that the plane mosaic accuracy is better than one pixel, and feather processing is also suitable for the seamless mosaicking of the camera overlap region after the relative orientation is determined.

Positioning Accuracy
Based on the updated RFMs obtained in Step 4, the complete stitched image can be generated using Step 5 and Step 6 in the workflow. To evaluate the performance of the proposed image mosaicking approach for maintaining the image position accuracy, the absolute and relative positioning accuracy for the complete stitched image and the single camera image from the single camera with the original RFM was analyzed. We extracted the corresponding DOM and DEM, and 64 GCPs for the complete stitched image and 32 GCPs for the single camera image were matched to evaluate the positioning accuracy. The absolute positioning accuracy is the image positioning accuracy without GCPs, and all GCPs were performed as check points to evaluate the absolute positioning accuracy. The internal relative positioning accuracy can be evaluated using the positioning accuracy with a few GCPs. An affine transformation of the image was applied based on 4 GCPs in the 4 corners of the image to eliminate the exterior systematic offset of the positioning, and the internal relative accuracy statistics were calculated based on the other GCPs. The RFM affine model was also used as the exterior orientation model [24,25].
The geometric accuracy evaluation results of the stitched images are listed in Table 5. The absolute positioning accuracy of the complete virtual image was approximately equal to the absolute positioning accuracy of the two single camera images due to the big virtual detector being placed in the center line of the double-camera detectors along the track direction. Additionally, the relative positioning accuracy of the complete virtual images was also approximately equal to one of the two single camera images. Though approximately 0.1 pixels for the internal relative accuracy loss occur after mosaicking, it still performed well in maintaining the original geometric accuracy. In addition, the internal relative accuracy achieved by the affine transformation is slightly better than the translation transformation, which benefits from the well matched dense homonymy points in the camera overlap region. However, in most cases, due to cloud coverage and complicated terrain, dense evenly distributed homonymy points are difficult to obtain for the coefficient calculations for the high accuracy affine transformation, which will lead to instability of the image edge away from the camera overlap region. However, the translation transformation is a much simpler method, and it can be achieved using a few homonymy points and has better practicability, therefore, it is the optimal method in the relative orientation. Thus, we conclude that the proposed mosaicking approach has no considerable negative effects on the absolute and relative positioning accuracy.

Conclusions and Future Work
In this paper, we have proposed an automatic image mosaicking approach for the double-camera system on the optical remote sensing satellite GF2, and subtly used the concept of the big virtual camera to obtain the stitched image and the corresponding high accuracy RFM for post concurrent processing. The following conclusion may be drawn from our results:

1)
Based on the rigorous imaging model for the single camera, the rigorous imaging model for the big virtual camera was established, which would exactly apply to the complete stitched image. Additionally, the image mosaic workflow based on the virtual single detector and the big virtual camera were presented in detail. 2) High accuracy camera parameters on the satellite are necessary in the proposed mosaicking approach. Therefore, high accuracy on-orbit geometric calibration is the precondition to guarantee the effectiveness of our approach. 3) Benefiting from the platform stability and the small relative installation error, the mosaic error of the camera overlap region could be controlled in one pixel after an on-orbit high accuracy geometric calibration and the relative orientation, otherwise, a local correction may be required to achieve seamless mosaicking. 4) Cloud coverage and complex terrain in the camera overlap region may influence the homonymy point matching. Although we verified the ability of the translation transformation in the relative orientation to reduce the dependence on the homonymy point quantity and quality, it still cannot be applied to all situations. 5) The relative installation instability of the double-cameras and the high-frequency platform vibration are key factors in achieving the seamless mosaic from the double-cameras, therefore, if they can be securely attached in the future satellite platform, a simpler workflow without relative orientation can achieve a more ideal mosaic result.