Pushbroom Hyperspectral Data Orientation by Combining Feature-Based and Area-Based Co-Registration Techniques

Direct georeferencing of airborne pushbroom scanner data usually suffers from the limited precision of navigation sensors onboard of the aircraft. The bundle adjustment of images and orientation parameters, used to perform geocorrection of frame images during the post-processing phase, cannot be used for pushbroom cameras without difficulties—it relies on matching corresponding points between scan lines, which is not feasible in the absence of sufficient overlap and texture information. We address this georeferencing problem by equipping our aircraft with both a frame camera and a pushbroom scanner: the frame images and the navigation parameters measured by a couple GPS/Inertial Measurement Unit (IMU) are input to a bundle adjustment algorithm; the output orientation parameters are used to project the scan lines on a Digital Elevation Model (DEM) and on an orthophoto generated during the bundle adjustment step; using the image feature matching algorithm Speeded Up Robust Features (SURF), corresponding points between the image formed by the projected scan lines and the orthophoto are matched, and through a least-squares method, the boresight between the two cameras is estimated and included in the calculation of the projection. Finally, using Particle Image Velocimetry (PIV) on the gradient image, the projection is deformed into a final image that fits the geometry of the orthophoto. We apply this algorithm to five test acquisitions over Lake Geneva region (Switzerland) and Lake Baikal region (Russia). The results are quantified in terms of Root Mean Square Error (RMSE) between matching points of the RGB orthophoto and the pushbroom projection. From a first projection where the Interior Orientation Parameters (IOP) are known with limited precision and the RMSE goes up to 41 pixels, our geocorrection estimates IOP, boresight and Exterior Orientation Parameters (EOP) and produces a new projection with an RMSE, with the reference orthophoto, around two pixels.


Introduction
Pushbroom cameras prevail in hyperspectral remote sensing because they usually offer much better spectral resolution and much higher number of bands than frame cameras.The usual method to collect and georeference pushbroom data is to equip the aircraft with navigation sensors like GPS and Inertial Measurement Units (IMU), fuse the data from both of these sensors via Kalman Filtering, and perform direct georeferencing [1]-knowing the position and attitude of the vehicle at the time of acquisition, data can be projected on a constant height field or a Digital Elevation Model (DEM) using the collinearity equation, which relates the pixel coordinates of the data to the ground coordinates of the points where they were acquired, in a geographic or projected reference system.However, the precision obtained by this method is affected by the quality of the navigation data [2], as well as the intrinsic calibration of the imaging sensor [3].Due to this, the ground error of the georeferencing can be several metres to tens of metres, depending on the altitude of the aircraft and the Ground Sampling Distance (GSD) [4,5].To improve the quality of the georeferencing, several ancillary data can complement the navigation data.The measurement, on the field, of Ground Control Points (GCP) helps correcting the absolute orientation of the airborne images [6,7].They can be input to a bundle adjustment algorithm to correct the navigation data of all images as well as build an orthomosaic of the whole area flown.Manually input tie points for given pairs of images can refine the adjustment and improve the relative orientation of the images.The bundle adjustment theory is, however, of little help in the case of pushbroom imagery: three GCPs per scan line would be required to georeference them properly [8]; the automatic tie points matching between scan lines is extremely difficult because it usually relies on variations of the SIFT algorithm [9] and consequently the 2D neighbourhoods of the points, which are not necessarily available in a 1D acquisition.To bypass these constraints, extra assumptions can be made about the trajectory of the vehicle.More specifically, assuming smoothness of the orientation parameters, the complexity of the problem can be reduced down to the determination of Gauss-Markov processes for such parameters [10,11].
Another possible approach is the use of an existing georegistered reference image.One then attempts to match features from a grayscale version of the hyperspectral scan lines to the grayscale reference image.While the literature reports attempts of such matching solely based on the correlation criteria (without any navigation data) [12], having a first orthoimage by projecting the scan lines using the navigation parameters, and then applying matching algorithms, is a much more robust and reliable procedure.The most noticeable example is a work carried out by Cariou & Chehdi [13]: an image is formed from the pushbroom scan lines and georeferenced using the navigation parameters from the GNSS/IMU system.The reference image values at the ground points where the scan lines were projected are backward projected to image coordinates, and matching from the backward projection to the image formed by the scan lines is performed with a mutual information criterion.
In the following, we present a new method, based on the co-registration onto a RGB reference, that uses both feature matching and area matching methods.We first introduce, in Section 1.1, the scientific context of the study, namely the Leman-Baikal project, the technical setup, and the problem we wish to solve.Section 1.2 references important known contributions in the domain of pushbroom data georeferencing and co-registration to RGB data.Section 2 details the processing we use to perform and improve the co-registration of our RGB data and pushbroom hyperspectral data: a bundle adjustment-including all the RGB images and the filtered exterior orientation parameters-is processed, allowing the creation of a reference orthomosaic and a DEM, and a first correction of the orientation parameters for the pushbroom scanner.The scan lines are then co-registered on the orthomosaic, thus producing an orthoimage that is distorted, compared to the reference.Then, the use of Speeded Up Robust Features (SURF) to find tie points between the reference orthomosaic and the projection allows compensating for the systematic error.To further refine the quality of the co-registration, local distortions are estimated by finding the maximum of a normalised cross-correlation criterion.Based on the estimated distortions, a least-squares optimisation algorithm computes the best fitting EOP; the final projection is obtained by projecting the scan lines with the newly estimated parameters.Results, in terms of quality of the co-registration (measured with RMSE) and new orientation parameters, are analysed in Section 3.

The Leman-Baikal Project
From 2013 to 2015, several research laboratories of the École Polytechnique Fédérale de Lausanne (Switzerland) and the Lomonosov Moscow State University (Russia) collaborated to study the water quality of both Lake Geneva (Switzerland), also called "Lac Léman" in French, and Lake Baikal (Russia).One of the main goals of this project, named the Leman-Baikal project, was to develop a hyperspectral sensing system to be embedded on a ultralight plane, to collect and use hyperspectral data to measure parameters like concentration of chlorophyll-α or turbidity.The system used for the acquisition consisted in an SBG Ekinox2-N Integrated GPS+IMU system, a Headwall Micro Hyperspec VNIR-A hyperspectral camera, an IDS-Imaging USB 2 uEye SE UI-2280 SE RGB camera, and an Intel NUC to handle the communication between devices.To allow proper use of the hyperspectral data collected, the post-processing chain contains two stages:

•
Geometric correction: the scan lines collected by the pushbroom scanner shall be properly georeferenced.

•
Radiometric correction: the output spectra shall be filtered for inherent noise, and corrected for atmospheric and water-surface reflection effects.
The present article deals with the geometric correction problem.For any further information about the Leman-Baikal project, the reader may refer to [14].

Problem Formulation
Considering the setting given in Section 1.1.1 and illustrated in Figure 1, we wish to:

•
Compute a georeferenced orthomosaic composed with the scan lines collected by the hyperspectral pushbroom sensor.

•
Retrieve the corrected orientation parameters.They include Interior Orientation Parameters (IOP: principal distance, principal point of the pushbroom camera, and potentially the distortion parameters of the lens) and Exterior Orientation Parameters (EOP: roll, pitch, yaw, and three position parameters) for all the scan lines.

•
Estimate the boresight between the IMU and the pushbroom sensor.
Remote Sens. 2018, 10, x 3 of 21 parameters like concentration of chlorophyll-α or turbidity.The system used for the acquisition consisted in an SBG Ekinox2-N Integrated GPS+IMU system, a Headwall Micro Hyperspec VNIR-A hyperspectral camera, an IDS-Imaging USB 2 uEye SE UI-2280 SE RGB camera, and an Intel NUC to handle the communication between devices.To allow proper use of the hyperspectral data collected, the post-processing chain contains two stages: • Geometric correction: the scan lines collected by the pushbroom scanner shall be properly georeferenced.• Radiometric correction: the output spectra shall be filtered for inherent noise, and corrected for atmospheric and water-surface reflection effects.
The present article deals with the geometric correction problem.For any further information about the Leman-Baikal project, the reader may refer to [14].

Problem Formulation
Considering the setting given in Section 1.1.1 and illustrated in Figure 1, we wish to: • Compute a georeferenced orthomosaic composed with the scan lines collected by the hyperspectral pushbroom sensor.• Retrieve the corrected orientation parameters.They include Interior Orientation Parameters (IOP: principal distance, principal point of the pushbroom camera, and potentially the distortion parameters of the lens) and Exterior Orientation Parameters (EOP: roll, pitch, yaw, and three position parameters) for all the scan lines.• Estimate the boresight between the IMU and the pushbroom sensor.The latitude, longitude and altitude, relatively to the WGS84 Geoid, and attitude as roll, pitch and yaw output by the GPS+IMU system are considered as initial estimates, which our geocorrection The latitude, longitude and altitude, relatively to the WGS84 Geoid, and attitude as roll, pitch and yaw output by the GPS+IMU system are considered as initial estimates, which our geocorrection algorithm should refine.Additionally, we want the algorithm to not depend on the measure of GCPs: measuring GCPs is a time consuming task, and using the information they provide for pushbroom scan lines would imply to make assumptions on the trajectory of the aircraft to reduce the complexity of the problem.

State of the Art
Georeferencing scan lines from a pushbroom sensor can be achieved with various setups and assumptions on the motion of the scanner.The most common approach is Integrated Sensor Orientation (InSO), involving the use of GCPs together with a model of the trajectory of the airborne vehicle/satellite.Such contributions include [10], assuming constant biases of roll, pitch and yaw angles across the scan lines, and estimating these biases using control points and Aerotriangulation (AT).In [15], the EOP of either a satellite or an airborne vehicle are modelled by piecewise polynomials, and optimised together with ephemeris or GPS/IMU data and GCPs using a least squares method.Some additional parameters like the lever-arm between the GPS antenna and the pushbroom scanner, the boresight between the IMU and the scanner, or the non-uniform geometry of the CCD array can also be optimised by a least square method [16].Rigorous models, including the effects of atmospheric refraction and potential cartographic transformation, help the georeferencing of satellite data to be more robust to GCP outliers [17].Using the co-registration to pre-existing reference orthoimages, relying on point features or linear features, can replace the measurements of GCPs [18].Simple methods, like the identification of a straight line, coupled with GCPs, even allow the ability to obtain a decent mosaic from the scan lines [19], though such method obviously does not provide any information on the vehicle's trajectory.However, if no assumption on the smoothness of the trajectory is made, estimating the EOP for each scan line would imply to measure an unrealistically high number of GCPs [8].To alleviate reliance on GCPs, a significant amount of contributions make use of the co-acquisition of frame images, and develop methods to co-register the frame images and the scan lines.The most sophisticated algorithm seems to be given by [20]: the interior orientation of the pushbroom scanner is calibrated and the sensor is aligned with the frame camera using a target not parallel to the plane of the CCD array, and then the stitching of the frame images allows the construction of the hyperspectral mosaic.Additionally, this method does not require any navigation data, however, it supposes either the calibration with the triangle target prior to every flight, or the stability of the relative orientation of the two sensors.The ultralight plane flights operated during the Leman-Baikal project were typically two to three hours long, therefore such stability cannot be assumed.Approaches involving the co-acquisition of frame images, scan lines and navigation data perform well [21].The existence of frame images avoids the problem of weak geometry of the scan lines; the bundle adjustment of frame images outputs better orientation for said images, which in turn can be used as orientation for the pushbroom scanner, by estimating the boresight between the two cameras and defining an interpolation scheme to compensate for the difference of acquisition frequencies [22].Among other contributions, [13] proposes a co-registration to an existing referencing using a mutual information criterion, and explores different scenarios of missing/inaccurate orientation data.
Considering the existing contributions, the originality of the work presented in the following sections is that the problem of georeferencing pushbroom scan lines is solved using navigation data and co-acquired frame images, without needing: • GCPs

•
A model for the vehicle's trajectory

•
A priori knowledge on the misalignment between the IMU and the pushbroom scanner, which is important since, for low to average quality IMUs, the initial attitude alignment takes an arbitrary wrong value at the beginning of each flight.
Furthermore, the algorithm presented below leads to new estimates of EOP for every scan line independently.

Proposed Methodology
The geocorrection method proposed consists in five steps.First, we process a bundle adjustment of the frame images to retrieve corrected EOP for the frame camera and the navigation sensors; the adjustment allows the creation, in parallel, of a DEM of the area.Second, using these EOP for the pushbroom camera, a first mosaic is created by projecting the scan lines on the DEM.Third, points are matched between the orthophoto from the frame images and the pushbroom mosaic; these matches allow the estimation of the systematic error in the pushbroom georeferencing, in the form of the boresight between the IMU and the pushbroom scanner, and the IOP of the scanner.The projection of the scan lines is updated using the estimated boresight parameters.Fourth, the residual errors are compensated for by computing the local normalised cross correlations between the reference orthophoto and the mosaic.The mosaic is elastically deformed into a new mosaic which fits the orthophoto better.Fifth, these deformations allow better estimation of the EOP for each scan line and computation of a final mosaic that is well co-registered onto the reference orthophoto.The flowchart of this method is given in Figure 2.

Proposed Methodology
The geocorrection method proposed consists in five steps.First, we process a bundle adjustment of the frame images to retrieve corrected EOP for the frame camera and the navigation sensors; the adjustment allows the creation, in parallel, of a DEM of the area.Second, using these EOP for the pushbroom camera, a first mosaic is created by projecting the scan lines on the DEM.Third, points are matched between the orthophoto from the frame images and the pushbroom mosaic; these matches allow the estimation of the systematic error in the pushbroom georeferencing, in the form of the boresight between the IMU and the pushbroom scanner, and the IOP of the scanner.The projection of the scan lines is updated using the estimated boresight parameters.Fourth, the residual errors are compensated for by computing the local normalised cross correlations between the reference orthophoto and the mosaic.The mosaic is elastically deformed into a new mosaic which fits the orthophoto better.Fifth, these deformations allow better estimation of the EOP for each scan line and computation of a final mosaic that is well co-registered onto the reference orthophoto.The flowchart of this method is given in Figure 2.

Pre-Processing Step: Radiometric Matching
Knowing the later steps of our algorithm (Sections 2.4 and 2.5) rely on matching features or areas of the RGB images to the pushbroom hyperspectral images, the radiometric properties of both sources need to match accordingly.The hyperspectral camera is calibrated (spectrally and radiometrically) using a Spectralon panel, which is Lambertian and has a nearly unit-reflectance across the spectral range of our camera (400-850 nm).The band-wise quantum efficiency for each band of the RGB camera is provided by the manufacturer; to produce comparable RGB data from our hyperspectral camera, we integrate the quantum efficiency of one band of the frame camera, times the spectral signal delivered by the hyperspectral camera.Concretely, let λ be the wavelength variable, and QE b (λ) the quantum efficiency of the frame camera in the band b (red, green or blue).Let λ n be the central wavelength of the nth band of the hyperspectral camera.Then, h b , the data in band b synthesised from the hyperspectral image h, is given by Equation (1).

Pre-Processing Step: Radiometric Matching
Knowing the later steps of our algorithm (Sections 2.4 and 2.5) rely on matching features or areas of the RGB images to the pushbroom hyperspectral images, the radiometric properties of both sources need to match accordingly.The hyperspectral camera is calibrated (spectrally and radiometrically) using a Spectralon panel, which is Lambertian and has a nearly unit-reflectance across the spectral range of our camera (400-850 nm).The band-wise quantum efficiency for each band of the RGB camera is provided by the manufacturer; to produce comparable RGB data from our hyperspectral camera, we integrate the quantum efficiency of one band of the frame camera, times the spectral signal delivered by the hyperspectral camera.Concretely, let λ be the wavelength variable, and QE b (λ) the quantum efficiency of the frame camera in the band b (red, green or blue).Let λ n be the central wavelength of the nth band of the hyperspectral camera.Then, h b , the data in band b synthesised from the hyperspectral image h, is given by Equation (1).
A three-bands image is then created by applying Equation (1) to the three bands of the frame camera.Images of both cameras are then converted to greyscale for the following steps of the algorithm.

Bundle Adjustment of Frame Images
In the first stage of the geocorrection, the information available in frame imagery is used to refine the estimates of the EOP given by the navigation sensors.Using the commercial software Agisoft Photoscan, we perform a bundle adjustment of frame images (Figure 3).Each image is associated with a set of EOP, initialised with the values output by the GPS+IMU system.Photoscan handles the stitching of the RGB images by finding tie points; EOP are optimised in the process and can be exported from the software.However, these parameters correspond to the frame camera.To retrieve corrected parameters for the IMU, two additional steps must be completed: the interpolation of the corrected parameters, since the acquisition frequency of the frame camera is much lower than the one of the navigation system; and the computation of the boresight between the frame camera and the IMU.Our solutions to both of these problems are described in [22].However, the boresight from frame camera to IMU can be also be estimated using recently developed methods involving the use of raw inertial measurements [23].Alongside the optimisation, the bundle adjustment allows to build and export an orthophoto and a DEM (Figure 3).A three-bands image is then created by applying Equation (1) to the three bands of the frame camera.Images of both cameras are then converted to greyscale for the following steps of the algorithm.

Bundle Adjustment of Frame Images
In the first stage of the geocorrection, the information available in frame imagery is used to refine the estimates of the EOP given by the navigation sensors.Using the commercial software Agisoft Photoscan, we perform a bundle adjustment of frame images (Figure 3).Each image is associated with a set of EOP, initialised with the values output by the GPS+IMU system.Photoscan handles the stitching of the RGB images by finding tie points; EOP are optimised in the process and can be exported from the software.However, these parameters correspond to the frame camera.To retrieve corrected parameters for the IMU, two additional steps must be completed: the interpolation of the corrected parameters, since the acquisition frequency of the frame camera is much lower than the one of the navigation system; and the computation of the boresight between the frame camera and the IMU.Our solutions to both of these problems are described in [22].However, the boresight from frame camera to IMU can be also be estimated using recently developed methods involving the use of raw inertial measurements [23].Alongside the optimisation, the bundle adjustment allows to build and export an orthophoto and a DEM (Figure 3).The DEM generated was compared to co-localised data from the Shuttle Radar Topography Mission (SRTM).Since the DEM produced had a sub-meter resolution, we down-sampled it to fit the resolution of the SRTM data.The result was a 34 × 45 tiles DEM in the region delimited by the latitudes 52.164 • N, 52.176 • N and the longitudes 106.385 • E, 106.410 • E. The RMSE between this DEM and the SRTM DEM was 7.24 m, which falls in the typical range of errors of the SRTM [24].

Initial Ortho-Projection of the Scan Lines
With the set of optimised orientation parameters for the navigation system derived in Section 2.2, we create an orthomosaic composed with the scan lines collected by the pushbroom sensor.From the bundle adjustment, we already have a georeferenced orthophoto and the corresponding DEM; the orthomosaic is obtained by projecting the scan lines on the DEM using the collinearity equation.Let us call R enu ned the rotation matrix from the North-East-Down (NED) local-level frame (l-frame) to the East-North-Up (ENU) l-frame, both centred at the optical centre of the pushbroom camera.Let R ned I MU be the rotation matrix from the IMU frame to the NED l-frame, and R I MU pb the rotation matrix corresponding to the boresight from the pushbroom camera to the IMU.At this stage, R ned I MU is given by the estimates of the roll, pitch and yaw obtained in Section 2.2, and noted Rned I MU .Initially, the boresight The DEM generated was compared to co-localised data from the Shuttle Radar Topography Mission (SRTM).Since the DEM produced had a sub-meter resolution, we down-sampled it to fit the resolution of the SRTM data.The result was a 34 × 45 tiles DEM in the region delimited by the latitudes 52.164 • N, 52.176 • N and the longitudes 106.385 • E, 106.410 • E. The RMSE between this DEM and the SRTM DEM was 7.24 m, which falls in the typical range of errors of the SRTM [24].

Initial Ortho-Projection of the Scan Lines
With the set of optimised orientation parameters for the navigation system derived in Section 2.2, we create an orthomosaic composed with the scan lines collected by the pushbroom sensor.From the bundle adjustment, we already have a georeferenced orthophoto and the corresponding DEM; the orthomosaic is obtained by projecting the scan lines on the DEM using the collinearity equation.Let us call R enu ned the rotation matrix from the North-East-Down (NED) local-level frame (l-frame) to the East-North-Up (ENU) l-frame, both centred at the optical centre of the pushbroom camera.Let R ned I MU be the rotation matrix from the IMU frame to the NED l-frame, and R I MU pb the rotation matrix corresponding to the boresight from the pushbroom camera to the IMU.At this stage, R ned I MU is given by the estimates of the roll, pitch and yaw obtained in Section 2.2, and noted Rned I MU .Initially, the boresight is unknown and supposed to be negligible, hence RIMU pb is the identity matrix.The total rotation from the pushbroom camera to the ENU l-frame is given by Equation (2).
We can georeference the pushbroom data according to the collinearity equation Equation (3).
In Equation ( 3), µ corresponds to the collinearity coefficient, v is the metric coordinate of the pixel of interest in the body frame of the pushbroom camera.u pp , v pp are the coordinates of the principal point of the camera in the same frame, f is the focal length of the camera, (X G , Y G , Z G ) are the coordinates of the corresponding ground point in the ENU l-frame and (X oC , Y oC , Z oC ) are the coordinates of the optical centre in the same frame.Considering the radial distortion parameters K 1 , K 2 , K 3 . . . the decentring coefficients P 1 and P 2 , u = 0 − u pp and v = v − v pp , Brown's model for lens distortion [25] gives the expressions of δu and δv (Equation ( 4) and ( 5)).
Equation ( 3) gives the most generic form of the collinearity equation for our problem.However, at this stage, R enu pb = Renu pb , the position of the optical centre is the one output by the bundle adjustment, the optical centre's coordinates and the distortion parameters are unknown and the focal length is known with limited precision.Therefore, X oC = XoC , Y oC = YoC , Z oC = ZoC , ůpp = vpp = 0, the distortion parameters are all set to zero, f = f ( f = 0.012m in our case) and the collinearity equation turns into Equation (6).
We use ray casting to find the ground coordinates (X G , Y G , Z G ).The intersection of the rays with the DEM is computed using the following algorithm [26]: from the optical centre of coordinates ( XoC , YoC , ZoC ), trace the ray driven by the vector Rtotal × (0, v, f ) T .The candidate 2D pixels are the pixels that are crossed by the ray, and where the ray is between the lowest and the highest altitude of the DEM.The projection pixel is the first pixel below which the ray passes.This process is applied to all the pixels of all the scan lines collected, and the resulting ground pixels form an image that is superimposed on the reference mosaic.Since the resolution of our reference mosaic is much higher than the one of our pushbroom camera, the image produced is sparse, with important gaps between the projected points.Around each of these pixels, we leave a marker of size n × n (where n is the number of pixels; we choose n = 5 here) to compute a dilated footprint of the pushbroom projection.Inside this footprint, the isolated pixel values projected using the collinearity equation are interpolated with a 2D bilinear interpolation to produce the final projected image.An example of such image can be seen on Figure 4.

Systematic Error Correction
As can be seen on Figure 4b, the georeferencing is far from perfect, as the co-registration between the reference orthophoto and the scan lines exhibit significant discrepancies.However, the error made seem quite coherent across the image, suggesting that is mainly comes from systematic errors: • Bad intrinsic camera calibration (inaccurate focal length, principal point coordinates, or lens distortion parameters).• Non-negligible boresight between the IMU and the pushbroom camera.
To compensate for these errors, we proceed in two steps: we first identify tie points between the reference and the pushbroom mosaic, and then use these tie points as observations to adjust the IOP and the boresight parameters using a least-squares method.

Matching Points with SURF
We use the Speeded Up Robust Features (SURF) [27] to find corresponding points between the RGB reference and the pushbroom mosaic.SURF is a powerful tool for point matching that determines points of interest from multiple Gaussian-filtered version of the image, and matches them according to local-orientation descriptors computed with Haar wavelets.Provided that our images are radiometrically similar (which was discussed in Section 2.1), SURF is very suitable to find matching points between our reference orthophoto and the scan lines mosaic.The only obstacle is the use of bilinear interpolation when creating the mosaic: SURF relies on properties of the neighbourhoods of the points of interest, which are synthetic data in the case of an interpolation.However, as discussed further in the results section (Section 3), this was a problem only when the aircraft's motion was significant, in which case the interpolation had to fill important gaps between scan lines.
The SURF algorithm is applied to our reference orthophoto and the current pushbroom mosaic.The pairs of points chosen are filtered in two ways: first, the points too close to the border of the footprint of the mosaic are removed.Indeed, no-data (dark) points outside of the footprint might influence their neighbourhoods.Second, absurd matches found by SURF must be discarded.Although methods could be considered at this stage to discard these outliers, they will be dealt with at the least squares adjustment step.
The matches for the same flight are shown on Figure 5b.

Systematic Error Correction
As can be seen on Figure 4b, the georeferencing is far from perfect, as the co-registration between the reference orthophoto and the scan lines exhibit significant discrepancies.However, the error made seem quite coherent across the image, suggesting that is mainly comes from systematic errors:

•
Bad intrinsic camera calibration (inaccurate focal length, principal point coordinates, or lens distortion parameters).
• Non-negligible boresight between the IMU and the pushbroom camera.
To compensate for these errors, we proceed in two steps: we first identify tie points between the reference and the pushbroom mosaic, and then use these tie points as observations to adjust the IOP and the boresight parameters using a least-squares method.

Matching Points with SURF
We use the Speeded Up Robust Features (SURF) [27] to find corresponding points between the RGB reference and the pushbroom mosaic.SURF is a powerful tool for point matching that determines points of interest from multiple Gaussian-filtered version of the image, and matches them according to local-orientation descriptors computed with Haar wavelets.Provided that our images are radiometrically similar (which was discussed in Section 2.1), SURF is very suitable to find matching points between our reference orthophoto and the scan lines mosaic.The only obstacle is the use of bilinear interpolation when creating the mosaic: SURF relies on properties of the neighbourhoods of the points of interest, which are synthetic data in the case of an interpolation.However, as discussed further in the results section (Section 3), this was a problem only when the aircraft's motion was significant, in which case the interpolation had to fill important gaps between scan lines.
The SURF algorithm is applied to our reference orthophoto and the current pushbroom mosaic.The pairs of points chosen are filtered in two ways: first, the points too close to the border of the footprint of the mosaic are removed.Indeed, no-data (dark) points outside of the footprint might influence their neighbourhoods.Second, absurd matches found by SURF must be discarded.Although methods could be considered at this stage to discard these outliers, they will be dealt with at the least squares adjustment step.
The matches for the same flight are shown on Figure 5b.

Interior Orientation and Boresight Estimation
The matched points act as measures that can be used to perform a least squares adjustment and estimate the IOP and the boresight.Since the mosaic of scan lines is derived from an interpolation,

Interior Orientation and Boresight Estimation
The matched points act as measures that can be used to perform a least squares adjustment and estimate the IOP and the boresight.Since the mosaic of scan lines is derived from an interpolation, not every point matched corresponds to an actual pixel from the pushbroom scanner; for each point matched, we simply compute the closest point which did not derive from the interpolation, and get the index of its scan line l as well as the index k of its pixel in the scan line.We wish to find the best parameters so that Equation (3) is verified.Ideally, the functions g k,l,1 , g k,l,2 and g k,l,3 given by Equation ( 7) should equal zero for all matches.   g k,l,1 (r b , p b , y b , f , K 1 , K 2 , P 1 , P 2 ) g k,l,2 (r b , p b , y b , f , K 1 , K 2 , P 1 , P 2 ) g k,l,3 (r b , p b , y b , f , K 1 , K 2 , P 1 , P 2 ) The boresight roll, pitch and yaw are noted respectively r b , p b and y b in Equation ( 7).The planar coordinates of each corrected ground point, (X G k,l , Y G k,l ) are the ones output by the SURF; Z G k,l is the value of the DEM at coordinates (X G k,l , Y G k,l ).From the third line of Equation ( 7), µ k,l can be computed, leaving the first two lines as two constraints that ideal parameters should satisfy.The input system of our least squares optimisation is given by Equation (8).
In Equation ( 8), N M is the number of matches and the indexes k are not necessarily distinct, neither are the indexes j, as matches can occur in the same scan line, or for the same pixel in different scan lines.Consider A to be the Jacobian matrix of g relatively to the eight parameters to optimise, and v = − g where g = g( rb , pb , ẙb , f , K1 , K2 , P1 , P2 ) is the vector obtained with set initial parameters (Equation ( 9)).
Then the increment of the parameters is given by A T A −1 A T v, according to the theory of least squares optimisation.It should be noted that we do not estimate the coordinates of the optical centre.Ideally, using as parameters all the IOP (focal length, principal point, distortion parameters) and the three boresight parameters, computing the least squares adjustment would bring corrected values for all these parameters.However, this is not possible, as the position of the optical centre and the couple (boresight roll, boresight pitch) are strongly correlated and prevent the algorithm from converging.Indeed, in a linearised model, a discrepancy in roll results in a shift of the ground points orthogonally to the direction of the vehicle, and so does an error on the v pp .A similar correlation exists between the boresight pitch and the second coordinate of the optical centre.Consequently, we have studied all the existing correlations between compensated parameters, for the images and tie points given in Figure 5b, in three different optimisation scenarios.Significant correlations were reported in Table 1.From the correlations given in Table 1, we have decided to optimise the boresight, f, K 1 , K 2 , P 1 and P 2 .It is to be noted that, due to the one-dimensional nature of pushbroom sensors (u = 0), P 1 and P 2 are significantly correlated to respectively the boresight pitch and the boresight roll.We have not observed convergence problem when optimising P 1 , P 2 and the rest of the parameters, however, such problems might occur when the tie points found all have similar v coordinates.
In order to discard outliers output by SURF, we compute the standard deviation σ( ǧ) of the residuals after compensation, ǧ = g( řb , pb , yb , f , Ǩ1 , Ǩ2 , P1 , P2 ).We discard any matched pixel k, l for which ǧk,l > 3σ( ǧ), and compute the least squares adjustement with the remaining matches.Figure 6 shows an example of outliers removal for the flight of Selenga village.Then the increment of the parameters is given by (A T A) −1 A T v, according to the theory of least squares optimisation.It should be noted that we do not estimate the coordinates of the optical centre.Ideally, using as parameters all the IOP (focal length, principal point, distortion parameters) and the three boresight parameters, computing the least squares adjustment would bring corrected values for all these parameters.However, this is not possible, as the position of the optical centre and the couple (boresight roll, boresight pitch) are strongly correlated and prevent the algorithm from converging.Indeed, in a linearised model, a discrepancy in roll results in a shift of the ground points orthogonally to the direction of the vehicle, and so does an error on the v pp .A similar correlation exists between the boresight pitch and the second coordinate of the optical centre.Consequently, we have studied all the existing correlations between compensated parameters, for the images and tie points given in Figure 5b, in three different optimisation scenarios.Significant correlations were reported in Table 1.From the correlations given in Table 1, we have decided to optimise the boresight, f , K 1 , K 2 , P 1 and P 2 .It is to be noted that, due to the one-dimensional nature of pushbroom sensors (u = 0), P 1 and P 2 are significantly correlated to respectively the boresight pitch and the boresight roll.We have not observed convergence problem when optimising P 1 , P 2 and the rest of the parameters, however, such problems might occur when the tie points found all have similar v coordinates.
In order to discard outliers output by SURF, we compute the standard deviation σ( ǧ) of the residuals after compensation, ǧ = g( řb , pb , yb , f , Ǩ1 , Ǩ2 , P1 , P2 ).We discard any matched pixel k, l for which ǧk,l > 3σ( ǧ), and compute the least squares adjustement with the remaining matches.Figure 6 shows an example of outliers removal for the flight of Selenga village.The uncertainty on the estimated values of the boresight and IOP need to be known for a second least squares adjustment used later in our algorithm (Section 2.6).They are given as the square root of the diagonal elements of A T A −1 .In Table 2, we have reported the values and standard deviations obtained for our Selenga village flight.
The values obtained for P 1 and P 2 are high: since the principal point coordinates cannot be estimated at this step, the error made by setting them to zero are compensated by other parameters.In particular, since u = 0 and v = v, δu = P 1 r 2 and the tangential distortion term of δv is limited to P 2 r 2 + 2v 2 .As a result, the error made on v pp propagates mainly to the roll and P 2 , and the error made on u pp propagates mainly to the pitch and P 1 .The optimised boresight and IOP are used in the projection process described in Section 2.3, resulting in a new mosaic that fits the reference orthophoto better (see Figure 5c).

Geocorrection Using Particle Image Velocimetry
In this section, we introduce the concept of Particle Image Velocimetry (PIV) and explain how we use it to perform geocorrection and retrieve better estimates of the orientation parameters.

PIV Theory
Particle Image Velocimetry is a technique primarily designed to quantify the movements of fluids [28].It is an image processing algorithm that computes local instantaneous velocities between two images.The algorithm proceeds in two steps:

•
Split the two images into a grid of cells, which sizes are given by the user.

•
Find the new location of each cell of the first image into the second image, by finding the maximum of the cross-correlation of the cell (normalised in mean and standard deviation) and its neighbourhood in the second image (also normalised).
Two sizes have to be input: the size of the cells in the first image, and the size of the interrogation areas in the second image.Interrogation areas are sub-images of bigger size than the cells of the first image, in which the algorithm will try to match the cell.Considering a cell as a 2D function c(x, y) from one image, and the corresponding interrogation area i(x, y) in the second image, the estimated position of c in i is the 2D vector (a, b) that maximises the cross-correlation between c and the sub-image found following the vector (a, b) from the origin of i (see Equation (10)).
In Equation (10), n refers to the number of pixels in the cell c, c and σ c are respectively the mean and the standard deviation of the intensity inside c, i jk and σ i jk are the mean and the standard image of the subset of the interrogation area shifted by (j, k).In practice, Equation ( 10) is computed in the Fourier domain (where the calculation is more efficient).The 2D Fourier transform of the cells/interrogation areas is performed with the Fast Fourier Transform (FFT) [29].
The result is a grid of 2D displacements vectors corresponding to the estimated movements for each cell.For all the PIV processing tasks described in the following sections, we use an open-source Matlab application, called PIVlab, by Thielicke & Stamhuis [30].This application includes the possibility of filtering the results by deleting the vectors which norms are higher by a threshold proportional to their standard deviation.Vectors can also be manually deleted, or estimated by interpolation of neighbouring vectors when the algorithm failed to find one, or when the vector was filtered.

Application: Elastic Deformation
We use PIV to compute local distortions from the current mosaic to the reference orthophoto.PIV only estimates local translations, and does not deal with rotations of rescaling transformations, however, at this stage, the mosaic and the orthophoto should match well enough so that the differences between the two images can be approximated be a set of local shifts.
At this stage, we have noticed that residual radiometric discrepancies between the two sensors impact the co-registration significantly.To mitigate these effects, we have studied the possibility of using gradient images.Let us call Gr x and Gr y the gradients of an image of interest along its x and y axis: we use the gradient magnitude, Gr = Gr 2 x + Gr 2 y ; we have let an algorithm select 50 random patterns of fixed size (30 × 30 pixels) from the mosaic computed in the previous steps and shown on Figure 5c.The 2D unbiased, normalised cross-correlation (which is the criterion maximised by the PIV algorithm) of each pattern with a 90 × 90 pixels pattern from the reference orthophoto was computed both for the original patterns, and for their gradients.We have plotted on Figure 7 the average cross-correlations along the vertical axis (the results along the horizontal axis were similar) for the images and the gradients.image of the subset of the interrogation area shifted by (j, k).In practice, Equation ( 10) is computed in the Fourier domain (where the calculation is more efficient).The 2D Fourier transform of the cells/interrogation areas is performed with the Fast Fourier Transform (FFT) [29].
The result is a grid of 2D displacements vectors corresponding to the estimated movements for each cell.For all the PIV processing tasks described in the following sections, we use an open-source Matlab application, called PIVlab, by Thielicke & Stamhuis [30].This application includes the possibility of filtering the results by deleting the vectors which norms are higher by a threshold proportional to their standard deviation.Vectors can also be manually deleted, or estimated by interpolation of neighbouring vectors when the algorithm failed to find one, or when the vector was filtered.

Application: Elastic Deformation
We use PIV to compute local distortions from the current mosaic to the reference orthophoto.PIV only estimates local translations, and does not deal with rotations of rescaling transformations, however, at this stage, the mosaic and the orthophoto should match well enough so that the differences between the two images can be approximated be a set of local shifts.
At this stage, we have noticed that residual radiometric discrepancies between the two sensors impact the co-registration significantly.To mitigate these effects, we have studied the possibility of using gradient images.Let us call Gr x and Gr y the gradients of an image of interest along its x and y axis: we use the gradient magnitude, Gr = Gr 2 x + Gr 2 y ; we have let an algorithm select 50 random patterns of fixed size (30 × 30 pixels) from the mosaic computed in the previous steps and shown on Figure 5c.The 2D unbiased, normalised cross-correlation (which is the criterion maximised by the PIV algorithm) of each pattern with a 90 × 90 pixels pattern from the reference orthophoto was computed both for the original patterns, and for their gradients.We have plotted on Figure 7 the average cross-correlations along the vertical axis (the results along the horizontal axis were similar) for the images and the gradients.As can be seen on Figure 7, the use of PIV on original images suggests that they are well co-registered with no further shift, while PIV applied to gradients images indicates that the best match is for an average shift of eight pixels up.Consequently, we proceed with gradient images; PIV is computed with the following parameters: the cell size is 128 × 128 pixels, the interrogation area size is 256 × 256 pixels, and the output vector field is filtered to contain only vectors with norms within the range [µ − 0.5σ, µ + 0.5σ], where µ and σ are, respectively, the mean and the standard deviation of the norm of vectors output by the PIV algorithm.The resulting vector field is sparse compared to the resolution of the mosaic, hence we interpolate the vector field so that a shift is assigned to every pixel of the mosaic.By translating every pixel following its assigned shift, a new mosaic is created (see Figure 5e).

Estimation of the Orientation Parameters
PIV results are considered as tie points: for each pixel projected on a ground point, corresponds a corrected projection indicated by PIV.Using this information, we can perform a bundle adjustment to retrieve the corrected exterior orientation parameters for each scan line.Considering N L and N P to be respectively the number of scan lines and the number of pixels per line, (X oC i , Y oC i , Z oC i ) and (r i , p i , y i ) As can be seen on Figure 7, the use of PIV on original images suggests that they are well co-registered with no further shift, while PIV applied to gradients images indicates that the best match is for an average shift of eight pixels up.Consequently, we proceed with gradient images; PIV is computed with the following parameters: the cell size is 128 × 128 pixels, the interrogation area size is 256 × 256 pixels, and the output vector field is filtered to contain only vectors with norms within the range [µ − 0.5σ, µ + 0.5σ], where µ and σ are, respectively, the mean and the standard deviation of the norm of vectors output by the PIV algorithm.The resulting vector field is sparse compared to the resolution of the mosaic, hence we interpolate the vector field so that a shift is assigned to every pixel of the mosaic.By translating every pixel following its assigned shift, a new mosaic is created (see Figure 5e).

Estimation of the Orientation Parameters
PIV results are considered as tie points: for each pixel projected on a ground point, corresponds a corrected projection indicated by PIV.Using this information, we can perform a bundle adjustment to retrieve the corrected exterior orientation parameters for each scan line.Considering N L and N P to be respectively the number of scan lines and the number of pixels per line, X oC i , Y oC i , Z oC i and (r i , p i , y i ) the EOP for line i, (X G i,j , Y i,j , Z G i,j ) the real ground point for pixel j of line i and µ i,j the collinearity coefficient for pixel j of line i, we wish to minimise the squares of the set of Equation (11).
The first line in the set of Equation ( 11) corresponds to the collinearity equation Equation ( 3), and provides two constraints (three, one of which is used to determine µ i,j ) per pixel.Therefore, collinearity equations provide 2 × N L × N P constraints.The rest of the constraints ensure that the final estimated parameters stay close to their initial estimates (noted with a circle) obtained either in Section 2.2 or Section 2.4.2 (some IOP were estimated using SURF matching and were noted with a "SURF" subscript).There are six EOP per line and seven IOP, in total, the least-square algorithm estimates 6 × N L + 7 parameters from 2 × N L × N P + 6 × N L + 7 constraints.Calling A PIV the Jacobian matrix of the system of Equations (11) and vPIV the opposite of its residuals, the least squares adjustment theory increments the parameters at each iteration by A T PIV WA PIV −1 A T PIV vPIV .Unlike the adjustment performed using SURF points, a weight matrix W must be used in this second adjustment, as there are different types of observations.W is a diagonal matrix; its first 2 × N L × N P elements are related to the expected error of the matching operated by PIV.We consider the ground points output by the PIV to have a standard deviation of 0.5 m , resulting in a weight 4 m −2 for the first 2 × N L × N P constraints (Equation ( 12)).
The uncertainties of the position and attitude parameters for each line are given by the manufacturer of the GPS+IMU system: 2 cm for position measurements, 0.02 • for roll and pitch angles, and 0.05 • for yaw angles (Equation ( 13 corresponding to step of our algorithm (first projection, after boresight/IOP calibration, after PIV and after estimation of the EOP).The results are given in Table 4.To provide quantitative results, we have determined the planar Root Mean Square Error (RMSE) on samples of 50 points for every test area.To provide the most objective statistics possible, these points were selected as follows: a series of random patterns of given size, extracted for the image of interest, was generated; inside each pattern, a point was chosen in the orthophoto and the four images corresponding to each step of our algorithm (first projection, after boresight/IOP calibration, after PIV and estimation of the EOP).The results are given in Table 4.For the first three areas, the RMSE decreased from about 25 m (before correction) to about one metre (after correction).While many tie points were found by SURF for Selenga Village 1 and Selenga Rivers, few were found for Selenga Village 2. The reason is the U-turn motion of the aircraft implied important gaps between the scan lines; to create the mosaic, data was interpolated between these scan lines, and no salient point recognisable by SURF can be detected in this synthetic data.However, in spite of the lesser decrease of the RMSE at the SURF stage (13.6 m, against 2.4 m and 1.3 m for Village 1 and Rivers, respectively), PIV performs very well and the final RMSE is approximately the same for the three areas.This result suggests that local normalised cross-correlation can deal with significant rotations like the ones that affected the Village 2 image.The final RMSE in the Gremyachinsk area (6.5 m), although seemingly worse than the others, is encouraging as the initial RMSE was significantly higher (45.5 m) and the image does not exhibit a significant heterogeneity; only 11 points were matched by SURF, all gathered on the few built structures at the bottom centre of the image.The application of PIV then allowed to decrease the RMSE by an additional 26 %.On the shore of Lake Geneva, significant discrepancies were observed in the first projection, where adjacent scan lines appeared to be taken with different attitudes and, as a consequence, the correction operated by PIV, although satisfying, did not respect the inherent geometrical properties of the pushbroom scanner: the shifts estimated by PIV showed very different values and directions within each single scan line.As a consequence, and for this test area only, the mosaic produced after correction of the EOP was better (RMSE = 1.6 m) than the one produced by PIV (RMSE = 1.8 m).It should be noted that the opposite is expected, and observed for the four other test areas, as the final mosaic is produced by an orthorectification algorithm, while PIV produces a mosaic regardless of the geometrical nature of the problem.Overall, the geocorrection algorithm performs well on areas with enough salient points, whether the motion of the aircraft is smooth (Selenga Village 1, Selenga Rivers) or irregular (Selenga Village 2, Lake Geneva Shore).

Conclusions
We presented here a post-processing algorithm for dual frame-pushbroom airborne acquisitions which estimates orientation parameters to co-register the orthophoto from frame images and the mosaic from pushbroom scan lines.The algorithm is split in several steps, including feature matching to compensate for systematic errors in the co-registration, local area matching, and estimation of both interior and exterior orientation parameters.Area matching is performed using local normalised cross-correlation in the form of Particle Image Velocimetry, whose use is new to the domain of remote sensing.Without GCP and any a priori model of the trajectory of the aircraft, the RMSE of the pushbroom scan lines with the reference orthophoto is reduced by a factor from 7 (for homogeneous areas) to 30 (heteogeneous areas and stable attitude of the aircraft).Such results were obtained on a variety of images depicting different of surfaces, acquired at different altitudes and for different types of trajectory of the aircraft.The geocorrected data has already been used for various applications including the study of heavy metals and the classification of vegetation in the Selenga delta.In the future, the Leman-Baikal project shall also make use of this data to investigate the presence of Chlorophyll-a and the turbidity in both Lake Geneva and Lake Baikal.

Figure 1 .
Figure 1.(a) Illustrates the setting for our airborne acquisitions.(b) Represents the different frame coordinates for the three sensors: the frame camera, the pushbroom (PB) camera, and the IMU .

Figure 1 .
Figure 1.(a) Illustrates the setting for our airborne acquisitions.(b) Represents the different frame coordinates for the three sensors: the frame camera, the pushbroom (PB) camera, and the IMU.

Figure 3 .
Figure 3. Processing of the frame images using Agisoft Photoscan: (a) alignment of frames and computation of the orthophoto; and (b) computation of the DEM of the area (Selenga Delta Village).

Figure 3 .
Figure 3. Processing of the frame images using Agisoft Photoscan: (a) alignment of frames and computation of the orthophoto; and (b) computation of the DEM of the area (Selenga Delta Village).

Figure 4 .
Figure 4. Orthoprojection of the scan lines (cyan) on top of the reference orthomosaic (red) above a village near the Selenga Delta of Lake Baikal.(a) is the orthoprojection of the pushbroom pixels (without interpolation) and (b) is the image as seen after bilinear interpolation of the projected values.

Figure 4 .
Figure 4. Orthoprojection of the scan lines (cyan) on top of the reference orthomosaic (red) above a village near the Selenga Delta of Lake Baikal.(a) is the orthoprojection of the pushbroom pixels (without interpolation) and (b) is the image as seen after bilinear interpolation of the projected values.

( f )
Projection after correction of the EOP.

Figure 5 .
Figure 5. Illustrations of the various steps of our algorithm for a test flight over a village of the Selenga Delta (Russia, 17 August 2014).All images show the superimposed scan lines (in cyan) on top of the reference orthophoto (in red).

Figure 5 .
Figure 5. Illustrations of the various steps of our algorithm for a test flight over a village of the Selenga Delta (Russia, 17 August 2014).All images show the superimposed scan lines (in cyan) on top of the reference orthophoto (in red).

Figure 6 .
Figure 6.Scatter of the difference vectors for the pairs of points matched by the SURF , for a flight over a village of the Selenga Delta.All the matches are represented by red crosses.Tilted blue crosses correspond to the pairs kept after removing outliers.

Figure 6 .
Figure 6.Scatter of the difference vectors for the pairs of points matched by the SURF , for a flight over a village of the Selenga Delta.All the matches are represented by red crosses.Tilted blue crosses correspond to the pairs kept removing outliers.

Figure 7 .
Figure 7. Averages of the cross-correlations of corresponding patterns between a reference orthophoto and a co-registered mosaic, for the patterns and their gradients.

Figure 7 .Figure 8 .
Figure 7. Averages of the cross-correlations of corresponding patterns between a reference orthophoto and a co-registered mosaic, for the patterns and their gradients.

Figure 8 .
Figure 8. Examples of co-registered patterns using PIV on original images (a-c) and on gradient images (d-f).

Figure 10 .
Figure 10.Superposition of the reference orthophotos and the mosaic produced with the scan lines: (a,c) before geocorrection and (b,d) after geocorrection.(a,b): Gremyachinsk; (c,d): shore of Lake Geneva.

Table 1 .
Maximum correlations of compensated parameters observed in different scenarios at first iteration of the least squares optimisation.

Table 1 .
Maximum correlations of compensated parameters observed in different scenarios at first iteration of the least squares optimisation.

Table 2 .
Values and standard deviations of boresight parameters and IOP estimated by the least squares adjustment.

Table 4 .
Planar RMSE for each test area, and percentage evolution from previous step, at each step of the geocorrection.