FPGA-Based On-Board Geometric Calibration for Linear CCD Array Sensors

With increasing demands in real-time or near real-time remotely sensed imagery applications in such as military deployments, quick response to terrorist attacks and disaster rescue, the on-board geometric calibration problem has attracted the attention of many scientists in recent years. This paper presents an on-board geometric calibration method for linear CCD sensor arrays using FPGA chips. The proposed method mainly consists of four modules—Input Data, Coefficient Calculation, Adjustment Computation and Comparison—in which the parallel computations for building the observation equations and least squares adjustment, are implemented using FPGA chips, for which a decomposed matrix inversion method is presented. A Xilinx Virtex-7 FPGA VC707 chip is selected and the MOMS-2P data used for inflight geometric calibration from DLR (Köln, Germany), are employed for validation and analysis. The experimental results demonstrated that: (1) When the widths of floating-point data from 44-bit to 64-bit are adopted, the FPGA resources, including the utilizations of FF, LUT, memory LUT, I/O and DSP48, are consumed at a fast increasing rate; thus, a 50-bit data width is recommended for FPGA-based geometric calibration. (2) Increasing number of ground control points (GCPs) does not significantly consume the FPGA resources, six GCPs is therefore recommended for geometric calibration. (3) The FPGA-based geometric calibration can reach approximately 24 times faster speed than the PC-based one does. (4) The accuracy from the proposed FPGA-based method is almost similar to the one from the inflight calibration if the calibration model and GCPs number are the same.


Introduction
Geometric calibration is one of the most important steps for quality control in high-resolution optical satellite imagery [1][2][3] and is a prerequisite for high-accuracy of direct georeferencing of remotely sensed images [4,5]. Previous researchers have made efforts in inflight geometric calibration on the basis of PC computers in the past decades. With increasing demands for real-time or near real-time remotely sensed imagery in applications such as military deployments, quick response to terrorist attacks and disaster rescue (e.g., flooding monitoring), the on-board implementation of geometric calibration is has been attracting many scientists' interest worldwide in recent years. All of the functional modules in Figure 1 are managed by the Timing Control module. The Template Images Selection module is used for selection of target template images from a template image library that is created from many template images, each of which has a unique ID number. The principle of this module is carried out through matching between the geodetic coordinates centralized at an imaged area and the coordinates of a georeferenced template image. The Image Matching module accurately determines the coordinates of spaceborne image and geodetic coordinates of template images by matching their sub-image windows. The Initial external orientation elements (EOEs) module computes the initial values of EOEs of the spaceborne sensor. The Bundle Adjustment module accurately computes six external orientation elements (EOEs). Due to the resource limitation of a FPGA chip, it usually applies external memory to store multiple template images and ground control points (GCPs) data. Random Access Memory (RAM) is used to store the data stream of a template image and/or an imaged scene temporarily. The line buffers of image data are generated using multiple RAMs for further image matching.
This paper presents the details of FPGA-based implementation of on-board geometric calibration. The input parameters (i.e., GCPs and initial EOEs) are assumed to be directly provided by other modules (for the details, please reference to the Huang and Zhou [7]). The paper is organized as follows: Section 2 overviews previously relevant efforts; Section 3 gives the detailed FPGA-based implementation of on-board geometric calibration. Section 4 describes the validation and the experimental results. The conclusions are drawn up in Section 5. All of the functional modules in Figure 1 are managed by the Timing Control module. The Template Images Selection module is used for selection of target template images from a template image library that is created from many template images, each of which has a unique ID number. The principle of this module is carried out through matching between the geodetic coordinates centralized at an imaged area and the coordinates of a georeferenced template image. The Image Matching module accurately determines the coordinates of spaceborne image and geodetic coordinates of template images by matching their sub-image windows. The Initial external orientation elements (EOEs) module computes the initial values of EOEs of the spaceborne sensor. The Bundle Adjustment module accurately computes six external orientation elements (EOEs). Due to the resource limitation of a FPGA chip, it usually applies external memory to store multiple template images and ground control points (GCPs) data. Random Access Memory (RAM) is used to store the data stream of a template image and/or an imaged scene temporarily. The line buffers of image data are generated using multiple RAMs for further image matching.
This paper presents the details of FPGA-based implementation of on-board geometric calibration. The input parameters (i.e., GCPs and initial EOEs) are assumed to be directly provided by other modules (for the details, please reference to the Huang and Zhou [7]). The paper is organized as follows: Section 2 overviews previously relevant efforts; Section 3 gives the detailed FPGA-based implementation of on-board geometric calibration. Section 4 describes the validation and the experimental results. The conclusions are drawn up in Section 5.

Relevant Efforts
Traditional geometric calibration methods have been investigated for several decades and a number of papers have been published in the computer vision [8][9][10], image processing [11,12], robotic vision [13,14] and photogrammetry communities. However, geometric calibration on-board spaceborne implementation has not yet been reported worldwide so far, although in-lab and/or inflight (also called on-orbit) geometric calibrations for various satellites have overwhelmingly been reported in the past two decades [15][16][17], such as SPOT1-5 [18], IKONOS [19], ALOS [20], Orbview-3 [21], IRS-1C [22], GeoEye-1 [23], MOMS-2P [24,25], CBERS-02B [26], TH-1 [27] and ZY-3 [28,29]. Jacobsen [30,31] implemented geometric calibration of the IRS-1C satellite using self-calibration bundle block adjustment with additional parameters. Crespi et al. [23] used the block adjustment method to investigate the inflight calibration of the GeoEye-1 satellite, with which the accuracy reached 3.0 m without GCPs. Lei [28] presented a self-calibration bundle block adjustment on the basis of the known interior orientation parameters with line array CCD. Li [29] and Li et al. [32] proposed a geometric calibration method with step-by-step on the basis of imaging model of the ZY-3 satellite. The experimental results indicated that the calibration accuracy of ZY-3 can meet the requirement of the accuracy of 1:50,000 mapping. Yang et al. [33] also presented an on-orbit geometrical calibration model for ZY-1 02C panchromatic camera. The experimental results demonstrated that the accuracy with or without GCPs is better than 0.3 pixels. Zhang et al. [34] proposed an on-orbit geometric calibration for and a validation of ZY-3 linear array sensors. Wang et al. [3] constructed an on-orbit rigorous geometric calibration model through selecting optimal parameters. The experimental results discovered that the geometric accuracy without GCPs is significantly improved after on-orbit geometric calibration. Cao et al. [35] proposed a named "look-angle calibration method" for on-orbit geometric calibration of ZY-3 satellite imaging sensors. The experimental analysis discovered that the accuracy under the nadir-looking images is higher than ±2.7 m when five GCPs are utilized and the laboratory calibration parameters are provided as initials. Cao et al. [36] proposed a simple and feasible orientation method for calibration of the CCD-detector's look angles in the three-line array cameras of ZY-3. The experimental results discovered that the accuracies in planimetry and height are 3.4 m and 1.6 m with four GCPs, respectively. Wang et al. [37] proposed an on-orbit geometric calibration method for TH-1 three-line-array camera based on a rigorous geometric model. The experiment results demonstrated that the accuracies in both planimetry and height are approximately 6.93 m and 3.96 m, respectively.
Although research in on-board geometric calibration is relatively rare so far, the FPGA-based on-board data processing for spaceborne images has been reported. For instance, the German small satellite BIRD applied an on-board data processing system consisting of DSP, FPGA and network co-processor in on-board implementation of radiation correction, rectification of partial systematic geometric and image classification using neural network algorithm [38]. Surrey Satellite Technology Limited in the UK applied FPGA as an on-board data processing chip in a small satellite [39]. The French used FPGA chip to realize on-board data processing for Pleiades imagery [40]. Stuttgart University in Germany designed an on-board computing system using FPGA for the small satellite Flying Laptop to realize real-time attitude control, housekeeping, data compression and image processing [41]. The U.S. Jet Propulsion Laboratory employed a Xilinx FPGA to realize on-board hyperspectral image classification based on Support Vector Machine (SVM) [42]. González et al. [43] conducted the investigation on an FPGA-based implementation of N-FINDR algorithm for hyperspectral image analysis. Williams et al. [44] also investigated an FPGA-based implementation of real-time cloud detection of spaceborne image. Hihara et al. [45] analyzed an on-board image processing system for hyperspectral imagery. All of the efforts mentioned above have shown a promise for FPGA-based implementation of geometric calibration, which is presented in this paper.

Brief Overview of Geometric Calibration Model
The different linear imaging systems may have their own imaging modes, resulting in the slight difference of the geometric calibration algorithm. This paper takes MOMS-2P as an example to describe the calibration model (for details readers should refer to [46,47]).
The geometric calibration algorithm of the MOMS imaging system was performed at the laboratories of the German Aerospace company DASA (Stuttgart, Germany), where the MOMS was developed and manufactured. A rigorous model of describing geometry calibration is composed of five parameters (also see Table 1): Principal point coordinates of each sensor (x 0 , y 0 ), Rotation parameter κ of CCD array in the image plane, Deviation of the focal length df and distortion parameter k of the sensor curvature. The sensor curvature is modeled by a second order polynomial equation. The parameter k here indicates the along track deviation at the edges of the CCD-array at 3000 pixel distance from the array center, caused by the sensor curvature.

Interior Orientation
Interior orientation is to transform sensor coordinates (i and j in Figure 2) into image coordinates (x and y in Figure 2) and correct the lens distortion (symmetric and tangential) and CCD array curvature distortion. Thus the following tasks should be performed.

•
Define the sensor (called "screen") coordinate system and image coordinate systems, • Apply the principal point offset that should be estimated from an in-lab or in-flight calibration, and • Correct various distortions.

Brief Overview of Geometric Calibration Model
The different linear imaging systems may have their own imaging modes, resulting in the slight difference of the geometric calibration algorithm. This paper takes MOMS-2P as an example to describe the calibration model (for details readers should refer to [46,47]).
The geometric calibration algorithm of the MOMS imaging system was performed at the laboratories of the German Aerospace company DASA (Stuttgart, Germany), where the MOMS was developed and manufactured. A rigorous model of describing geometry calibration is composed of five parameters (also see Table 1): Principal point coordinates of each sensor (x0, y0), Rotation parameter κ of CCD array in the image plane, Deviation of the focal length df and distortion parameter k of the sensor curvature. The sensor curvature is modeled by a second order polynomial equation. The parameter k here indicates the along track deviation at the edges of the CCD-array at 3000 pixel distance from the array center, caused by the sensor curvature. Interior orientation is to transform sensor coordinates (i and j in Figure 2) into image coordinates (x and y in Figure 2) and correct the lens distortion (symmetric and tangential) and CCD array curvature distortion. Thus the following tasks should be performed.


Define the sensor (called "screen") coordinate system and image coordinate systems,  Apply the principal point offset that should be estimated from an in-lab or in-flight calibration, and  Correct various distortions.

Transformation from Image to Reference Coordinate System
Each of the fore-, nadir-and aft-looking array has its own image coordinate system (xc, yc and zc, right-handed). An image reference coordinate system (xR, yR and zR, right-handed) is defined to unify image coordinates from all three arrays ( Figure 3). The transformation from an image coordinate

Transformation from Image to Reference Coordinate System
Each of the fore-, nadir-and aft-looking array has its own image coordinate system (x c , y c and z c , right-handed). An image reference coordinate system (x R , y R and z R , right-handed) is defined to unify image coordinates from all three arrays ( Figure 3). The transformation from an image coordinate system to the reference coordinate system involves a translation (d x , d y and d z ) and three rotations (ω, ϕ and κ). We define a counterclockwise rotation angle as positive. The transformation equation is: where: Sensors 2018, 18, x FOR PEER REVIEW 5 of 19 system to the reference coordinate system involves a translation (dx, dy and dz) and three rotations (ω, φ and κ). We define a counterclockwise rotation angle as positive. The transformation equation is: where:  Figure 3. From image coordinate to image reference coordinate system.

Geometric Calibration Model
For any image point within a CCD array, its image reference coordinates are (xR, yR and zR). The coordinates of the exposure center of the array in the ground coordinate system at the imaging epoch t are (XC(t), YC(t), ZC(t)). The corresponding ground point coordinates are (XG, YG, ZG). The collinearity condition states that all of the three points must lie on the same straight line: where rij (i, j = 1, 2, 3) are the elements of rotation matrix κ(t) are defined for each CCD array at the epoch t.
A separate image coordinate system is defined for each CCD array which is related to an image reference coordinate system by a 3D transformation. They are a focal length f, principal point offset (x0, y0), and a curvature parameter Kc. The sensor coordinates (i, j) are transformed to image coordinates in the following steps (the details can be referenced to [46,47]):


Step 1: Transformation from sensor/screen coordinates to image coordinates by Step 2: CCD curvature correction by cx Kc y y      ;


Step 3: Lens distortion correction is unavailable;  Step 4: Final image coordinates computed by 0 x cx x  , 0 y y y  .

Geometric Calibration Model
For any image point within a CCD array, its image reference coordinates are (x R , y R and z R ). The coordinates of the exposure center of the array in the ground coordinate system at the imaging epoch t are (X C (t), Y C (t), Z C (t)). The corresponding ground point coordinates are (X G , Y G , Z G ). The collinearity condition states that all of the three points must lie on the same straight line: where r ij (i, j = 1, 2, 3) are the elements of rotation matrix R R G = R R C (ϕ(t), ω(t), κ(t)); ϕ(t), ω(t) and κ(t) are defined for each CCD array at the epoch t.
A separate image coordinate system is defined for each CCD array which is related to an image reference coordinate system by a 3D transformation. They are a focal length f, principal point offset (x 0 , y 0 ), and a curvature parameter Kc. The sensor coordinates (i, j) are transformed to image coordinates in the following steps (the details can be referenced to [46,47]):
To reduce the calculation budget and save the FPGA resources, a first-order polynomial equation is adopted to compute the six EOEs, i.e.: The initial values of EOEs are provided by DLR, Germany. In one epoch t, Equation (3) is linearized by Taylor Series. Substitute Equation (4) into Equation (3), and then linearize it by Taylor Series, it yields: The vector form of Equation (5) is described as: The detailed derivation of A t can be found in [46,47]. When the number of GCPs are greater than 5, the observation equation is constructed as follows: n represents the number of GCPs. The Equation (8) is solved by least square algorithm, and the solutions are: The solution for Equation (9) is obtained by an iterative process. A flowchart of the entire algorithm is presented in Figure 4. To reduce the calculation budget and save the FPGA resources, a first-order polynomial equation is adopted to compute the six EOEs, i.e.: where The vector form of Equation (5) is described as: The detailed derivation of At can be found in [46,47]. When the number of GCPs are greater than 5, the observation equation is constructed as follows: L ln] T , n represents the number of GCPs. The Equation (8) is solved by least square algorithm, and the solutions are: The solution for Equation (9) is obtained by an iterative process. A flowchart of the entire algorithm is presented in Figure 4.   As seen from Figure 4, the initial data are first used to compute the rotate matrix (R R G ), which are used for computation of the coefficient matrix, A and constant matrix, L. The inversion of (A T A) is computed by an LDL T algorithm. The solution, X, is obtained by the iteration process and is compared to a given threshold. If the increments of X are less than the given threshold, then computation ends, and the X is considered as the final solution. Otherwise, the above computation is repeated, until the increments are less than the given threshold.

Design for On-Board Geometric Calibration
An FPGA-based implementation for on-board geometric calibration is proposed in Figure 5, which consists of four modules: Input Data, Coefficient Calculation, Adjustment Computation, and Comparison. The details of the four modules are described as follows: (1) The initial data and the updating data are stored in RAM of the Input Data module. When receiving an enable signal, the data are sent to the Coefficient Calculation module at the same clock cycle. (2) The elements of matrixes A and L (in Equation (8)) are calculated by the Coefficient Calculation module, and the computed results are sent to the Adjustment Computation module at the same clock cycle. (3) The solution X in Equation (9) is calculated by matrixes A and L in the Adjustment Computation module. (4) If the increments of solution X meet the requirement of a given threshold, the iteration computation is terminated, and the solution a 0 , b 0 , c 0 , d 0 , e 0 , f 0 , d 1 , e 1 and f 1 are outputted. Otherwise, the X is updated and the iteration is recomputed until the increments of the solutions meet the requirement of the given threshold. As seen from Figure 4, the initial data are first used to compute the rotate matrix ( ), which are used for computation of the coefficient matrix, A and constant matrix, L. The inversion of (A T A) is computed by an LDL T algorithm. The solution, X, is obtained by the iteration process and is compared to a given threshold. If the increments of X are less than the given threshold, then computation ends, and the X is considered as the final solution. Otherwise, the above computation is repeated, until the increments are less than the given threshold.

Design for On-Board Geometric Calibration
An FPGA-based implementation for on-board geometric calibration is proposed in Figure 5, which consists of four modules: Input Data, Coefficient Calculation, Adjustment Computation, and Comparison. The details of the four modules are described as follows: (1) The initial data and the updating data are stored in RAM of the Input Data module. When receiving an enable signal, the data are sent to the Coefficient Calculation module at the same clock cycle. (2) The elements of matrixes A and L (in Equation (8)) are calculated by the Coefficient Calculation module, and the computed results are sent to the Adjustment Computation module at the same clock cycle. (3) The solution X in Equation (9) is calculated by matrixes A and L in the Adjustment Computation module. (4) If the increments of solution X meet the requirement of a given threshold, the iteration computation is terminated, and the solution a0, b0, c0, d0, e0, f0, d1, e1 and f1 are outputted. Otherwise, the X is updated and the iteration is recomputed until the increments of the solutions meet the requirement of the given threshold.

FPGA-Based Parallel Computation for Matrixes , lt and At
Because the elements (r11 through r33) in the rotation matrix involves sine and cosine functions of three rotational angles, φ, ω and κ, which is time/resources-consuming, a parallel computation method is presented in Figure 6a, where the implementations of sine and cosine functions are carried out by a CORDIC IP core. To ensure that all of the intermediate results are outputted at the same clock cycle, the delay units are adopted for some intermediate results. This module includes 12 multipliers, 2 adders and 2 subtractors.
For computation of lt in Equation (6), the initial data stored in the Input Data module and the rotation matrix are used to compute the lx and ly, which are presented in Figure 6b. The delay units are used to ensure that the final results are outputted at the same clock cycle. 3.2.2. FPGA-Based Parallel Computation for Matrixes R R G , l t and A t Because the elements (r 11 through r 33 ) in the rotation matrix R R G involves sine and cosine functions of three rotational angles, ϕ, ω and κ, which is time/resources-consuming, a parallel computation method is presented in Figure 6a, where the implementations of sine and cosine functions are carried out by a CORDIC IP core. To ensure that all of the intermediate results are outputted at the same clock cycle, the delay units are adopted for some intermediate results. This module includes 12 multipliers, 2 adders and 2 subtractors.
For computation of l t in Equation (6), the initial data stored in the Input Data module and the rotation matrix are used to compute the l x and l y , which are presented in Figure 6b. The delay units are used to ensure that the final results are outputted at the same clock cycle. Six elements in matrix A t , i. e., a 11i , a 12i , a 13i , a 21i , a 22i , a 23i , are computed in parallel, where 18 multipliers, 1 divider and 6 subtractors are employed (see Figure 6c). The rest of the elements in matrix A t , i. e., a 14i , a 15i , a 16i , a 17i , a 18i , a 19i , a 24i , a 25i , a 26i , a 27i , a 28i , a 29i , are also computed in parallel, which includes 39 multipliers, 10 subtractors, 3 adders, and 1 divider (see Figure 6d). Six elements in matrix At, i.e., a11i, a12i, a13i, a21i, a22i, a23i, are computed in parallel, where 18 multipliers, 1 divider and 6 subtractors are employed (see Figure 6c). The rest of the elements in matrix At, i. e., a14i, a15i, a16i, a17i, a18i, a19i, a24i, a25i, a26i, a27i, a28i, a29i, are also computed in parallel, which includes 39 multipliers, 10 subtractors, 3 adders, and 1 divider (see Figure 6d).

FPGA-Based Parallel Computations for A T A and A T L
Due to limitation of the FPGA resource, a parallel computation method for A T A in Equation (9) is presented through modifying A T A, i.e.:

FPGA-Based Parallel Computations for A T A and A T L
Due to limitation of the FPGA resource, a parallel computation method for A T A in Equation (9) is presented through modifying A T A, i.e.: With considering the symmetry of A T A, an FPGA-based parallel computation for the upper triangular matrix of A T A is presented and depicted in Figure 7. As seem from Figure 7a, a number of processing elements (PEs) units are employed to reduce the complexity of computation [49]. All of the PE units are with the same structure, i.e., "a 1 b 1 + a 2 b 2 ", which is enlarged in Figure 7b. Similarly, the method for computation of A T L in Equation (8) is the same as that of the A T A.

T nn A A A A A A A A A A A A B A A A A A A A A
With considering the symmetry of A T A, an FPGA-based parallel computation for the upper triangular matrix of A T A is presented and depicted in Figure 7. As seem from Figure 7a, a number of processing elements (PEs) units are employed to reduce the complexity of computation [49]. All of the PE units are with the same structure, i.e., "a1b1 + a2b2", which is enlarged in Figure 7b. Similarly, the method for computation of A T L in Equation (8) is the same as that of the A T A.
where L is a lower triangular matrix, D is a diagonal matrix, and L T is L's transpose. The solutions of L and D are expressed by: With the characteristics of LDL T , d11 and li1 are first calculated, and then dii and lij are calculated on the basis of d11 and li1. In other words, the latter results are calculated on the basis of the formerly computed results. Since Equations (11) and (12) avoid the computation of square root and alleviate the dependency of the data, the modified equations, Equations (11) and (12), are able to speed up the computation [51].
The FPGA-based parallel computation for the LDL T is depicted in Figure 8. As observed from

FPGA-Based Parallel Computation for B −1
Also due to limitation of the FPGA resources, computing matrix inversion, B −1 is implemented through Cholesky decomposition method [50], i.e.: where L is a lower triangular matrix, D is a diagonal matrix, and L T is L's transpose. The solutions of L and D are expressed by: With the characteristics of LDL T , d 11 and l i1 are first calculated, and then d ii and l ij are calculated on the basis of d 11 and l i1 . In other words, the latter results are calculated on the basis of the formerly computed results. Since Equations (11) and (12) avoid the computation of square root and alleviate the dependency of the data, the modified equations, Equations (11) and (12), are able to speed up the computation [51].
The FPGA-based parallel computation for the LDL T is depicted in Figure 8. As observed from Figure 8, it includes 1 driver and 8 PE units. The PE i (i = 1, 2, 3 and 4) are for calculating d ii and l ij . Thus, the computation of B −1 can be divided into two steps: (1) Decompose B into the LDL T ; (2) Compute the inversion of LDL T by:  Summarily, the FPGA-based flowchart of B −1 is depicted in Figure 9, which consists of five parts. Each of the parts is explained in detail as follows.
(1) The first MUX at the left hand of Figure 9 is to construct the column elements of B; (2) The LDL T is to calculate the elements of L and D;  Summarily, the FPGA-based flowchart of B −1 is depicted in Figure 9, which consists of five parts. Each of the parts is explained in detail as follows.
(1) The first MUX at the left hand of Figure 9 is to construct the column elements of B; (2) The LDL T is to calculate the elements of L and D;

Test Area and Data Set
The test area and data set are from DLR, Germany. The data sets were used for the inflight geometric calibration of MOMS-2P. The test area is comprised of Scenes 27 through 30 from southeast of Germany to about 160 km beyond the Austrian border (see Figure 10, in which the area is about 178 × 50 km 2 , and the ground pixel size of imagery is 5.9 m at 390 km orbit height). In the test area, the ground coordinates of 10 GCPs and 24 check points were obtained by topographic maps at a scale of 1:50,000 with an accuracy of 1.5 m in X, Y and Z (see Figure 11, the details can be found in [46,47]).

Test Area and Data Set
The test area and data set are from DLR, Germany. The data sets were used for the inflight geometric calibration of MOMS-2P. The test area is comprised of Scenes 27 through 30 from southeast of Germany to about 160 km beyond the Austrian border (see Figure 10, in which the area is about 178 × 50 km 2 , and the ground pixel size of imagery is 5.9 m at 390 km orbit height).

Test Area and Data Set
The test area and data set are from DLR, Germany. The data sets were used for the inflight geometric calibration of MOMS-2P. The test area is comprised of Scenes 27 through 30 from southeast of Germany to about 160 km beyond the Austrian border (see Figure 10, in which the area is about 178 × 50 km 2 , and the ground pixel size of imagery is 5.9 m at 390 km orbit height). In the test area, the ground coordinates of 10 GCPs and 24 check points were obtained by topographic maps at a scale of 1:50,000 with an accuracy of 1.5 m in X, Y and Z (see Figure 11, the details can be found in [46,47]). In the test area, the ground coordinates of 10 GCPs and 24 check points were obtained by topographic maps at a scale of 1:50,000 with an accuracy of 1.5 m in X, Y and Z (see Figure 11, the details can be found in [46,47]).
They were located in the 4th zone of Gauss-Krueger coordinate system. The navigation data of Orientation Lines (OLs) (the definition of OLs can be found in [46,47]), the ground coordinates of GCPs and the corresponding image coordinates were all provided by Institute of Photogrammetry at University of Stuttgart [53]. Sensors 2018, 18, x FOR PEER REVIEW 12 of 19 Figure 11. Distribution of 10 GCPs and check points in the test field.
They were located in the 4th zone of Gauss-Krueger coordinate system. The navigation data of Orientation Lines (OLs) (the definition of OLs can be found in [46,47]), the ground coordinates of GCPs and the corresponding image coordinates were all provided by Institute of Photogrammetry at University of Stuttgart [53].

Relationship between the Data Width and the Accuracy
Due to the resource limitation of the FPGA chip, this paper attempts to study the relationship between the floating-point data width vs. the accuracy of geometric calibration. The experiment is conducted using 6 GCPs under the floating-point data widths, 44-bit, 48-bit, 50-bit, 54-bit and 64-bit. In accordance with the IEEE standard 754, a floating-point data width consists of the sign part, exponent part and fractional part. The five types of floating-point data widths are experimented, as shown in Table 2.

Relationship between the Data Width and the Accuracy
Due to the resource limitation of the FPGA chip, this paper attempts to study the relationship between the floating-point data width vs. the accuracy of geometric calibration. The experiment is conducted using 6 GCPs under the floating-point data widths, 44-bit, 48-bit, 50-bit, 54-bit and 64-bit. In accordance with the IEEE standard 754, a floating-point data width consists of the sign part, exponent part and fractional part. The five types of floating-point data widths are experimented, as shown in Table 2. With the different data widths, the solutions of the coefficients in Equation (5) computed by the FPGA and the PC computer are presented in Table 3. As seen in Table 3, when the data width increases from 44-bit to 64-bit, the maximum differences of the coefficients, ∆b 0 and ∆c 0 decrease from 2.8684 to 5.8651 × 10 −4 , and from 0.9676 to 0.0021, respectively. The differences of the coefficients, ∆d 0 , ∆e 0 , ∆f 0 , ∆d 1 , ∆e 1 and ∆f 1 , decrease from approximatively 1 × 10 −7 to 1 × 10 −10 when the data wide decrease from 44-bit to 64-bit. When the data width reaches 64-bit, the solutions of 9 coefficients solved by the FPGA and by the PC computer are almost exactly the same. Consequently, the experimental results discover that (1) the differences of geometric calibration parameters solved by FPGA and PC computer are small enough; (2) With increasing the data widths, the accuracy of geometric calibration parameters increases.

Relationship between the Data Width and the Consumption of FPGA Resources
Although an increasing data width can increase the accuracy of the calibration parameters, it consumes more FPGA resources. Thereby, it is necessary to investigate the relationship between the data width and the consumption of FPGA resources. For this reason, the five widths of floating-point data and the consumptions of FPGA resources, including FF, LUT, Memory LUT, I/O and DSP48, are analyzed. The results are displayed in Figure 12. As seen from Figure 12, when the data width increases from 44-bit to 54-bit, the utilization of FF increases from 14.98% to 16.37%, LUT from 61.31% to 75.33%, Memory LUT from 9.32% to 10.86%, I/O from 33.57% to 40.71% and DSP48 from 55.71% to 92.86%. The consumption of DSP48 unit increases from 6 to 10 when data width changes from 50-bit to 54-bit. This means that the DSP48 resources consume very fast when data width increases. When the data width reaches 64-bit, the utilizations of FPGA resources reaches 100%. It can therefore be concluded that the data width with 54-bit is recommended in this paper.  Tables 3 and 4. As observed from Tables 3 and 4, the computational speed at data widths of 44-bit, 48-bit, 50-bit and 54-bit can reach a 217 clock cycle (approximately 0.01736 ms under the clock frequency of 12.5 MHz) when using FPGA, which is 22 times faster than that by the PC-based implementation. However, the computation at the data width of 64-bit is failure to be operated, since the utilization of the DSP48 unit reaches 100% (see Figure 12). Hence, the data width with 64-bit is not recommended.
With the experimental results in Table 3, it can be concluded that when the data width increases from 44-bit to 64-bit, the accuracy can be improved, while the speed of FPGA-based implementation remains consistent, except for the data width of 64-bit. The reasons are that (1) the IP cores with different data widths can be defined as the same clock delay for the results outputting; (2) the larger data width will consume more DSP48 unit, resulting in that the consumption of DSP48 in 64-bit data width reaches over 100% (see Figure 12). Therefore, a width of floating point data with 50-bit is recommended for on-board geometric calibration. In general, the more GCPs, the higher geometric calibration accuracy. However, for a FPGA-based implementation, many GCPs will add the burden of computational time and the consumption of FPGA resources. Thereby, it is necessary to investigate the optimum number of GCPs for on-board geometric calibration. For this reason, this sub-section investigates the different numbers of GCPs (i.e., 5, 6, 8 and 10) vs. computational accuracy when the floating-point data width is fixed at 50-bit. Also for the purpose of comparison, the calibration parameters solved by PC-based Microsoft Visual studio 2015 (C++) is applied as the references under the same GCPs. The differences of the parameters between the FPGA-based and the PC-based computations are shown in Table 5.  Tables 3 and 4. As observed from Tables 3 and 4, the computational speed at data widths of 44-bit, 48-bit, 50-bit and 54-bit can reach a 217 clock cycle (approximately 0.01736 ms under the clock frequency of 12.5 MHz) when using FPGA, which is 22 times faster than that by the PC-based implementation. However, the computation at the data width of 64-bit is failure to be operated, since the utilization of the DSP48 unit reaches 100% (see Figure 12). Hence, the data width with 64-bit is not recommended.
With the experimental results in Table 3, it can be concluded that when the data width increases from 44-bit to 64-bit, the accuracy can be improved, while the speed of FPGA-based implementation remains consistent, except for the data width of 64-bit. The reasons are that (1) the IP cores with different data widths can be defined as the same clock delay for the results outputting; (2) the larger data width will consume more DSP48 unit, resulting in that the consumption of DSP48 in 64-bit data width reaches over 100% (see Figure 12). Therefore, a width of floating point data with 50-bit is recommended for on-board geometric calibration. In general, the more GCPs, the higher geometric calibration accuracy. However, for a FPGA-based implementation, many GCPs will add the burden of computational time and the consumption of FPGA resources. Thereby, it is necessary to investigate the optimum number of GCPs for on-board geometric calibration. For this reason, this sub-section investigates the different numbers of GCPs (i.e., 5, 6, 8 and 10) vs. computational accuracy when the floating-point data width is fixed at 50-bit.
Also for the purpose of comparison, the calibration parameters solved by PC-based Microsoft Visual studio 2015 (C++) is applied as the references under the same GCPs. The differences of the parameters between the FPGA-based and the PC-based computations are shown in Table 5. As seen in Table 5, with increasing number of GCPs, the differences of calibration parameters become smaller and smaller. The maximum difference reaches 0.015 for a 0 , when the number of GCPs is 5; even decreases to 0.008 when the number of GCPs reaches 10. The differences of the other parameters, b 0 , c 0 , d 0 , e 0 , f 0 , d 1 , e 1 and f 1 in Table 5, display a similar phenomenon.

Relationship between the Number of GCPs and the Consumption of FPGA Resources
In order to select an optimum number of GCPs, a relationship between the number of GCPs and the consumption of the FPGA resources is investigated. With the fixed data wide at 50 bit as described above, 5 GCPs, 6 GCPs, 8 GCPs and 10 GCPs are selected to investigate the consumption of FPGA resources, respectively. The results are presented in Table 6. As observed from Table 6, when the number of GCPs varies from 5 to 10, the consumptions of the BRAM, DSP48 and BUFG remain the same, but the consumptions of FF, LUT and Memory LUT increase a little bit. This means that the increasing number of GCPs will not significantly consume the FPGA resources.  Table 7. As observed in Table 7, the computational time changes from 0.0170 ms to 0.0186 ms when the number of GCPs increases from 5 to 10, while the computational time from PC-based computation increases from 0.377 ms to 0.463 ms. In other words, the FPGA-based maximum computation speed can reach approximately 22 times faster than the PC-based does when five GCPs are adopted; the FPGA-based minimum computation speedup can achieve approximately 25 times faster than the PC-based does when the number of GCPs increases to 10. Thereby, it can be concluded that FPGA-based on-board implementation of geometric calibration can reach approximately 24 times faster than the PC-based does.

Accuracy Comparison between FPGA-Based and Inflight-Based Computations
To investigate the relationship between FPGA-based and inflight-based implementations of geometric calibration, the data wide is fixed at 50 bits; the 10 GCPs and 24 check points are selected; and the calibration model, associated with other conditions, are the same. The experimental results are presented in Table 8. As shown in Table 8, the RMS X of 11 m for X-coordinate, RMS Y of 8 m for Y-coordinate and RMS Z of 11 m for Z-coordinate can be reached. The RMS differences (noted, ∆) between the inflight-based and the FPGA-based implementations are 0.16 m for X-coordinate, 0.19 m for the Y-coordinate and 0.11 m for the Z-coordinate, respectively.

Conclusions
This paper first presents an FPGA-based on-board computation and implementation for geometric calibration. A Xilinx Virtex-7 FPGA VC707 board is selected as hardware and the experimental data used for inflight geometric calibration from DLR (Köln, Germany), is employed to validate our method. The main contributions of this paper are as follows.
(1) An FPGA-based on-board geometric calibration computation is designed and implemented.
(2) FPGA-based parallel computation for coefficient matrixes construction (e.g., matrix A and matrix L), matrix multiplication (e.g., A T A and A T L), matrix decomposition (e.g., B = LDL T ), and matrix inversion (e.g., B −1 ) are developed. With the experimental results, it has been demonstrated that the proposed method is able to save large amount of the FPGA resources. (3) From the experimental results, it can be found that: With increasing data width, FPGA resources consume increasingly. For example, the utilization of DSP48 unit suddenly increase from 55.7% to 92.9%, this fact demonstrates that the data width of 64-bit is impropriate for on-board implementation of geometric calibration for the selected FPGA chip due to the limitation of the FPGA resource.
The computation speed executed by the FPGA is approximately 22 times faster than that executed by the PC computer when five GCPs are adopted; and approximately 25 times faster than that executed by the PC computer when 10 GCPs are adopted. It can therefore be concluded that the computing speed executed by the FPGA can reach approximately 24 times faster than that executed by PC computer with Microsoft Visual Studio 2015 (C++). (c) More than 90% of the DSP48 unit resources are consumed when the data width of 54-bit or 64-bit is selected. Considering the limitation of FPGA resources, a width of 50-bit floating point data is recommended, as which it meets the requirement of geometric calibration accuracy.
(d) When the floating-point data width is fixed at 50-bit and the number of GCPs varies from five to 10, the utilizations of the FPGA's BRAM, DSP48, and BUFG remain unchanged, the utilizations of FF, LUT and Memory LUT slightly increase and the computational speed increase a little bit. This means that increasing the number of GCPs will not significantly increase the consumption of the FPGA resources and the computational speed. Therefore, six GCPs are recommended during an on-board geometric calibration.
The proposed method in this paper not only can be used for satellites, but also used for van-based mobile mapping and ground-based robots for speedup of geometric calibration. Moreover, part of the FPGA-based modules developed in this paper, such as matrix multiplication and matrix inversion, can be used for geometric calibration captured by frame cameras as well.