1. Introduction
With the rapid development of sensor technology, multisensor mapping platforms are becoming increasingly popular in geospatial data acquisition. Light detection and ranging (LiDAR) scanners and cameras are integrated on common platforms (i.e., rigid installations), where the relative orientation and position between the instruments are constant within the platform frame. The increased use of multisensor mapping platforms has been facilitated by smaller sensor form factors, lower costs, and increased accuracy, and has in turn enabled the fusion of complementary data sources. Weight limitations inherent to unmanned aerial vehicles (UAVs) restricted their payloads in the early days of UAV development. At present, UAVs built for mapping purposes are commonly equipped with several sensors such as LiDAR scanners, cameras, and global navigation satellite system (GNSS)-aided inertial navigation systems (INS), e.g., [1,2,3]. Hyperspectral imaging (HSI) cameras have also been integrated on GNSS/INS-supported UAVs together with LiDAR scanners, e.g., [4], and similar set-ups are used for mapping from helicopters and airplanes, e.g., [5,6].
The data from the different sensors have different strengths that are utilized in the data analysis. One example is the unique potential for detailed radiometric analyses using HSI. Applications of HSI include geological mapping, e.g., [7,8], forestry, e.g., [9,10], urban classification, e.g., [6,11], and agriculture, e.g., [12]. HSI cameras are often linear pushbroom (LP) cameras, as one image dimension is used for signal registration in multiple spectral bands. After data acquisition, the geospatial accuracy of the data from the different sensors is traditionally obtained and documented through fundamentally different processing pipelines. Geometric errors may introduce both systematic and random errors in the data to be analyzed. Enhanced geometric accuracy thus increases the value of the acquired data and allows more advanced analysis techniques to be applied, including temporal studies from recurring data acquisitions. Images are commonly aligned in a bundle adjustment, while LiDAR point clouds are separately matched in a strip adjustment to provide accurate estimates of the interior and exterior parameters of the respective sensors. Bundle adjustment traditionally discretizes the trajectory as the exterior orientation of the camera per image exposure, while LiDAR strip adjustment discretizes the trajectory corrections over temporally longer strips. However, if the sensors are mounted on the same platform, their observations conceptually provide information for estimating common trajectory parameters (i.e., the GNSS/INS position and orientation errors). Due to the sensor-specific trajectory error parameterization, these common parameters cannot be readily estimated from the observation residuals of the different functional models.
The operating principles of frame cameras, LP cameras, and LiDAR scanners are shown in Figure 1. The passive cameras store radiometric information in image pixels. Forward overlap, commonly seen in photogrammetric blocks of frame images, cannot be achieved with LP image lines without unwanted distortion. The active LiDAR scanner measures the range and angle of the emitted pulses. Depending on the scanner-specific configuration and technology, forward overlap may also be achieved in LiDAR scanning, for example, using rotating mirrors or prisms.
Here, we present a novel theoretical development for integrating observations from 3D LiDAR point clouds, 2D images from frame cameras, and 1D image lines from LP cameras in a joint hybrid adjustment using one common trajectory correction model for all observation types. This modeling allows the data from the three sensor types to be integrated at the observation level such that the optimal matching solution across all the data is found. To our knowledge, this is the first demonstration of a joint adjustment involving these three sensor modalities using a rigorous and scalable trajectory formulation.
3. Methodology
Figure 2 shows the workflow, with the method divided into four simplified steps, and Figure 3 shows the steps in more detail. Thorough explanations of the individual steps in Figure 3 can be found in [13,14]; thus, only the basic underlying theory is covered in this section.
3.1. Preprocessing
The preprocessing steps are needed to retrieve the observations from frame images and LP image lines before the adjustment. Additionally, the platform trajectory is computed, and point clouds are formed from the direct georeferencing of LiDAR points.
3.1.1. GNSS/INS Processing
A Kalman filter/smoother is used to retrieve the trajectory of the platform using the software TerraPos; see [25]. The trajectory consists of high-rate positions and orientations of the platform, along with their full covariance matrices. To conduct the GNSS/INS processing as presented, an IMU must be rigidly mounted on each platform, and the lever arm between the GNSS antenna and the IMU must either be known a priori or estimated in the Kalman filter. Typically, a gyro-stabilized mount serves as the platform within an aircraft.
3.1.2. Point Cloud Georeferencing
Once an initial trajectory has been computed via the Kalman filter, it is used for the direct georeferencing of the LiDAR measurements together with the a priori mounting parameters and the raw LiDAR scanner measurements. The result is a georeferenced point cloud with spatial mismatches between overlapping scans owing to scanner noise, errors in the estimated trajectory, and unknown errors in the interior parameters and mounting of the scanner.
3.1.3. Observation Retrieval from Frame Images
Image observation retrieval is based on finding salient key points in multiple images using a standard key-point detector [26]. The descriptors corresponding to the key points are then matched to find the key-point correspondences likely to represent the same tie-point in object space. Here, the binary robust invariant scalable keypoints (BRISK [27]) method is used for the efficient and accurate matching of key points.
Epipolar geometry constraints are used to filter the correspondences using random sample consensus (RANSAC [28]). The epipolar constraint for a pinhole camera relates the image coordinates in one image to those in another image. Usually, hundreds to thousands of point pairs exist between two overlapping frame images, providing an overdetermined system, and RANSAC effectively filters out the outlier key-point correspondences.
The remaining key-point correspondences after RANSAC filtering serve as image observations in the following adjustment.
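As an illustration, this retrieval step could be sketched with OpenCV as follows; the function, its thresholds, and the matcher settings are assumptions for the sketch and not the implementation used here.

```python
# Illustrative sketch only (assumed OpenCV defaults); not the paper's implementation.
import cv2
import numpy as np

def match_frame_images(img1, img2, ransac_thresh_px=1.0):
    """Return RANSAC-filtered key-point correspondences between two frame images."""
    brisk = cv2.BRISK_create()
    kp1, des1 = brisk.detectAndCompute(img1, None)
    kp2, des2 = brisk.detectAndCompute(img2, None)

    # Binary BRISK descriptors are compared with the Hamming norm.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des1, des2)

    pts1 = np.float32([kp1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([kp2[m.trainIdx].pt for m in matches])

    # RANSAC on the epipolar (fundamental-matrix) constraint removes outliers.
    _, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC,
                                     ransac_thresh_px, 0.999)
    inliers = mask.ravel().astype(bool)
    return pts1[inliers], pts2[inliers]
```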
3.1.4. Observation Retrieval from LP Image Lines
As with the observations used for the bundle adjustment of frame images, the LP image observations are key-point correspondences identified in multiple overlapping images that represent the same tie-point in object space. However, identifying key points in LP image lines is problematic, as key-point detectors and descriptors rely on local 2D image neighborhoods. Since an LP image line consists of only a single pixel in one of the image dimensions, a local image neighborhood cannot directly be used to describe a key point in 2D. Thus, several consecutive LP image lines are stacked to form 2D LP image scenes. However, these LP image scenes are subject to significant image distortions due to the relative camera rotation within an LP image scene (Figure 4a). A rotation compensation for this effect is therefore conducted as presented in [14] (Figure 4b). The method compensates for the rotations captured by the IMU over a small range of consecutive LP image lines to limit the image distortions in that neighborhood. After observation retrieval, the rotation compensation is no longer required. The bundle adjustment of LP image lines has shown sub-pixel planimetric accuracy without the use of other data sources [14].
Each LP image scene is relatively rotation-compensated using the preprocessed trajectory orientations (Figure 4). BRISK is then used to detect and describe the key points within the rotation-compensated LP image scenes, and RANSAC is used for filtering, just as with the frame images. After RANSAC filtering, the observations are registered to the pixel centers of their respective LP image lines. This center registration makes the observations exact for each LP image exposure; that is, the observations are independent of small errors in the local image scene neighborhood stemming from the approximate rotation compensation. The LP image observation precision is then well-defined from the standard uniform distribution.
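A minimal sketch of how a key point detected in a stacked LP image scene could be mapped back to its source image line and registered at the pixel center is given below; the helper and its parameters are hypothetical and only illustrate the center-registration idea.

```python
import numpy as np

def register_lp_observation(keypoint_xy, first_line_index, line_times):
    """Map a key point from a stacked LP image scene back to its source line.

    keypoint_xy      -- (col, row) coordinates in the stacked 2D scene
    first_line_index -- index of the first LP line used to build the scene
    line_times       -- exposure time stamps of all LP image lines
    """
    col, row = keypoint_xy
    line_index = first_line_index + int(round(row))  # the originating LP image line
    pixel_center = np.floor(col) + 0.5               # register at the pixel center
    return line_index, pixel_center, line_times[line_index]
```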
3.2. Initialization
The 3D tie-point positions in object space are estimated during the initialization step. The initial tie-point position is computed by finding the point closest to the intersection of the corresponding image rays from multiple images. Corrections to these positions are later estimated in the adjustment.
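The "closest point to all rays" initialization can be written as a small linear least-squares problem; the sketch below assumes unit ray directions and map-frame camera centers and only illustrates the principle.

```python
import numpy as np

def initial_tie_point(origins, directions):
    """Least-squares point closest to a bundle of image rays.

    origins    -- (n, 3) camera projection centers in the map frame
    directions -- (n, 3) unit ray directions towards the tie-point
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(origins, directions):
        P = np.eye(3) - np.outer(d, d)  # projector onto the plane normal to the ray
        A += P
        b += P @ o
    return np.linalg.solve(A, b)        # minimizes the summed squared ray distances
```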
The second part of the initialization step is the trajectory segmentation. This segmentation is based on the assumption that high-frequency trajectory errors are captured by the INS, while only low-frequency trajectory errors remain after the initial GNSS/INS processing. The background for the trajectory correction model is described in [13]. In summary, the trajectory corrections are estimated in the adjustment only at certain time steps with constant intervals. The corrections at the times of the discrete observations are interpolated using a cubic spline when estimating the parameters in the iterative least-squares process.
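A minimal sketch of this interpolation is given below, assuming SciPy's CubicSpline and an illustrative 10 s node spacing; the actual spline formulation is defined in [13].

```python
# Illustrative sketch (assumed SciPy spline, illustrative node spacing).
import numpy as np
from scipy.interpolate import CubicSpline

node_times = np.arange(0.0, 300.0, 10.0)           # trajectory segmentation nodes [s]
node_corrections = np.zeros((node_times.size, 6))  # 3 position + 3 orientation corrections

# The node corrections are the unknowns of the adjustment; the spline only
# provides their values at arbitrary observation times between the nodes.
spline = CubicSpline(node_times, node_corrections, axis=0)

def trajectory_correction(t):
    """6-vector of interpolated position/orientation corrections at time t."""
    return spline(t)
```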
3.3. Voxel Generation
Voxels are created from the iteratively corrected LiDAR point cloud to efficiently include the LiDAR data in the adjustment. The voxel-based method has been shown to ensure scalability independent of the LiDAR point-cloud density, efficiency in terms of processing time, and state-of-the-art accuracy [13]. No LiDAR points are stored in the voxels themselves, as only certain key properties are needed for each voxel. These metadata are the mean and covariance matrix of the point positions, the mean acquisition time, and the relative number of intermediate returns within the voxel. The voxelization discretizes the LiDAR point cloud into voxels of equal size, such that LiDAR points within the same voxel originate from the same LiDAR scanner and are spatiotemporally close to each other. All points within a voxel stem from the same trajectory segment, so that the trajectory corrections associated with that segment can be estimated from observations formed from the voxels.
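The per-voxel metadata could, for instance, be summarized as in the sketch below; the field names are hypothetical.

```python
# Illustrative sketch; the dictionary keys are hypothetical names.
import numpy as np

def voxel_metadata(points_xyz, times, is_intermediate_return):
    """Summary statistics stored per voxel instead of the raw LiDAR points."""
    return {
        "mean": points_xyz.mean(axis=0),                      # mean position
        "cov": np.cov(points_xyz, rowvar=False),              # spatial dispersion
        "mean_time": times.mean(),                            # for spline interpolation
        "intermediate_ratio": is_intermediate_return.mean(),  # vegetation indicator
        "n_points": points_xyz.shape[0],
    }
```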
The voxels are classified to remove vegetation and to provide meaningful matching between them. Voxels classified as vegetation are removed due to their potential temporal instability and geometric indistinctness in overlapping scans. Voxels of the other classes are matched to voxels of the same class. The voxels are classified into three main groups (Figure 5):
Planes, based on the eigen transformation of their covariance matrices. Subclasses within the group are horizontal, vertical, and other inclined planes.
Vegetation, based on the number of intermediate returns within a voxel.
Unclassified, representing indistinct geometric neighborhoods not classified as any of the other classes.
All voxels need to contain some tens of points to provide a sufficient statistical representation of the points within a voxel. Thus, voxels with relatively few points (e.g., <50 points) should be discarded. The method is robust to the exact choice of voxel size, and fine-tuning it is not needed [13].
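A hedged sketch of this classification logic is given below; all thresholds are illustrative assumptions and not values from the paper.

```python
# Illustrative sketch; all thresholds are assumptions, not values from the paper.
import numpy as np

MIN_POINTS = 50          # voxels with fewer points are discarded
VEGETATION_RATIO = 0.1   # share of intermediate returns indicating vegetation
PLANARITY_RATIO = 0.05   # smallest/middle eigenvalue ratio indicating a plane
TILT_TOL_DEG = 10.0      # tolerance for calling a plane horizontal or vertical

def classify_voxel(meta):
    if meta["n_points"] < MIN_POINTS:
        return "discarded"
    if meta["intermediate_ratio"] > VEGETATION_RATIO:
        return "vegetation"
    eigval, eigvec = np.linalg.eigh(meta["cov"])      # eigenvalues in ascending order
    if eigval[0] < PLANARITY_RATIO * eigval[1]:       # points concentrated on a plane
        normal = eigvec[:, 0]                         # plane normal (smallest eigenvalue)
        tilt = np.degrees(np.arccos(abs(normal[2])))  # angle between normal and vertical
        if tilt < TILT_TOL_DEG:
            return "horizontal plane"
        if tilt > 90.0 - TILT_TOL_DEG:
            return "vertical plane"
        return "inclined plane"
    return "unclassified"
```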
Voxel observations are created from the difference between the mean coordinates of two overlapping voxels. The observation is weighted with the inverse of the sum of the covariance matrices of the two voxels, which represent the spatial dispersion of points within each voxel. If the voxels represent planar surfaces, only the distance along the normal vector of the surface is minimized.
Hybrid observations are generated between the tie-points from images and the planar voxels from the LiDAR point clouds (Figure 6). These hybrid observations express the distance between the tie-point position and the mean coordinate of the overlapping voxel classified as a planar surface. The residual is minimized along the normal vector of the plane. The hybrid observation weight is defined by the planarity of the voxel, expressed through its eigen-transformed variance. Unlike LiDAR pulses, image rays cannot penetrate certain structures such as vegetation, as images only observe the point closest to the camera. Thus, the voxel means are mostly biased estimates of the tie-point positions in such areas, which effectively makes only planar voxels suitable for hybrid observations.
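A minimal sketch of how the voxel and hybrid observations and their weights could be formed from the stored voxel metadata is shown below; the helper names and notation are illustrative.

```python
# Illustrative sketch of the two observation types; notation is hypothetical.
import numpy as np

def voxel_observation(mu_a, cov_a, mu_b, cov_b):
    """Residual and weight between two overlapping voxels of the same class."""
    residual = mu_a - mu_b
    weight = np.linalg.inv(cov_a + cov_b)  # inverse sum of the voxel covariances
    return residual, weight

def hybrid_observation(tie_point, mu_voxel, cov_voxel):
    """Point-to-plane residual between a tie-point and a planar voxel."""
    eigval, eigvec = np.linalg.eigh(cov_voxel)
    normal = eigvec[:, 0]                               # plane normal (smallest eigenvalue)
    residual = float(normal @ (tie_point - mu_voxel))   # distance along the normal
    weight = 1.0 / eigval[0]                            # planarity-based weight
    return residual, weight
```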
3.4. Parameter Estimation and Correction
Once the observations have been formed between LP image lines, frame images, LiDAR voxels, and tie-points and voxels, they are connected to the estimation parameters through the linearization of the respective functional models. The models are referred to as the pinhole camera model, the LiDAR voxel model, and the hybrid voxel–tie-point model, as defined in [13]. Common to all these models is that they build upon the same trajectory correction modeling, where corrections are estimated only at certain time steps and interpolated using cubic splines.
The trajectory parameters are stochastically constrained using the initial full covariance matrices of the GNSS/INS solution. This helps to control the stability and reliability of the cubic spline trajectory model.
The observation weights are the inverse of the observation covariance matrices for the respective observation types, multiplied by the dynamic covariance scaling; see [29,30].
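A sketch in the spirit of the cited dynamic covariance scaling literature is given below; the scaling constant is an assumed tuning parameter, not a value from the paper.

```python
# Illustrative sketch in the spirit of dynamic covariance scaling; phi is an
# assumed tuning constant, not a value from the paper.
def dcs_scale(chi2, phi=1.0):
    """Scaling factor for an observation with squared normalized residual chi2."""
    return min(1.0, 2.0 * phi / (phi + chi2))

# Well-fitting observations (small chi2) keep a scale of 1, while observations
# with large residuals are progressively down-weighted, which limits the
# influence of gross outliers on the adjustment.
```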
3.4.1. Pinhole Camera Model
The image observations used in the adjustment are the tie-points in object space identified in multiple images as key-point correspondences. The pinhole model expresses the relationship between a point in object space and its image coordinates (Equation (1)) and is expanded to be differentiable with respect to the parameters of a kinematic mapping platform (Equation (2)). Although the traditional pinhole camera creates a frame image, using the cubic spline model of the trajectory corrections also allows the pinhole model to be used for LP image lines [14]. The residual minimized in the least-squares estimation is the reprojection error, i.e., the residual between the measured image coordinates and the image coordinates reprojected from the tie-point in object space (Figure 7).
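For concreteness, a sketch of the kinematic pinhole (collinearity) relation is given below in assumed notation; the symbols are illustrative and not necessarily those used in Equations (1) and (2).

```latex
% Hedged sketch of the kinematic collinearity relation in assumed notation.
\begin{equation*}
  \mathbf{r}^{c}(t) =
    \mathbf{R}^{c}_{p}\Big[\big(\mathbf{R}^{m}_{p}(t)\big)^{\top}
    \big(\mathbf{x}^{m}-\mathbf{p}^{m}(t)\big)-\mathbf{a}^{p}\Big],
  \qquad
  \begin{pmatrix} x \\ y \end{pmatrix} =
    -\,c\,\frac{1}{r^{c}_{z}}
    \begin{pmatrix} r^{c}_{x} \\ r^{c}_{y} \end{pmatrix}
    + \Delta\mathbf{x}
\end{equation*}
```

Here, the platform position and orientation are evaluated at the exposure time via the interpolated trajectory corrections.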
The quantities in Equations (1) and (2) are the following:
The coordinates of the point in the camera frame c (3 × 1 vector). This may be expanded to account for additional image corrections such as lens distortions or the principal point offset.
The principal distance of the camera in the camera frame (scalar).
The vector between the camera projection center and the object point in the camera frame.
The additional image corrections, i.e., the nonlinear distortion and the principal point; see [31].
The platform position in the map frame m at the time of measurement (3 × 1 vector).
The rotation matrix from the platform frame p to the map frame at the time of measurement (3 × 3 matrix).
The camera boresight matrix (3 × 3 matrix), i.e., the rotation matrix from the platform frame to the camera frame.
The coordinates of the point in the map frame (3 × 1 vector).
The lever arm from the platform origin to the camera optical center, expressed in the platform frame (3 × 1 vector).
Several spatial and temporal factors affect the image observation uncertainty, such as the key-point detection method, uncertainties in the lens distortion and atmospheric refraction models, and motion blur. Rigorous precision estimates of the image observations are therefore infeasible to obtain, and the precision is commonly predefined, for example, as 1/3 of a pixel.
3.4.2. LiDAR Voxel Model
The LiDAR voxel model expresses the difference between the mean coordinates of two overlapping voxels of the same class (Figure 8). The model is based on the direct georeferencing equation (Equation (3)), which gives the map-frame coordinates of a georeferenced LiDAR point.
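For concreteness, a sketch of the direct georeferencing relation is given below in assumed notation; the symbols are illustrative and not necessarily those of Equation (3).

```latex
% Hedged sketch of the direct georeferencing relation in assumed notation.
\begin{equation*}
  \mathbf{x}^{m}_{i} =
    \mathbf{p}^{m}(t_i) + \mathbf{R}^{m}_{p}(t_i)
    \Big[\mathbf{a}^{p}_{s} +
    \big(\mathbf{R}^{s}_{p}\big)^{\top} d_i\,\mathbf{u}^{s}_{i}\Big]
\end{equation*}
```

The time-dependent platform position and orientation are evaluated at the time stamp of the LiDAR point.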
In addition to the trajectory quantities defined for the camera model, Equation (3) involves the following:
The lever arm of scanner s in the platform frame (3 × 1 vector).
The scanner boresight matrix (3 × 3 matrix), i.e., the rotation matrix from the platform frame to the scanner frame s.
The line-of-sight unit vector of the LiDAR point (3 × 1 vector).
The range measurement of the LiDAR point (scalar). This may be expanded to account for additional interior scanner parameters, such as the range–scale factor and the range bias.
Since the individual LiDAR points are not stored in the voxels, the line-of-sight vector and range from Equation (3) are instead computed from the scanner-to-voxel geometry based on the mean position of the points within the voxel, and the voxel mean time is used to retrieve the time-dependent platform position and orientation via the cubic spline interpolation of the trajectory corrections.
The weights of LiDAR voxel observations are the inverse of the sum of their voxel covariance matrices. These covariance matrices describe the spatial dispersion of points within the confined voxel.
3.4.3. Hybrid Voxel–Tie-Point Model
Similar to the LiDAR voxel model, the hybrid voxel–tie-point model is also based on Equation (3). However, the voxel–tie-point model instead expresses the difference between the mean coordinate of a voxel classified as a planar surface and a tie-point located inside it (Figure 9). These observations are used to estimate the trajectory, LiDAR scanner, and tie-point corrections.
The weights of the hybrid observations are the inverse of the covariance matrix of the overlapping voxel. However, the relationship between voxels from LiDAR point clouds and tie-points from images is not fully described by the voxel means and covariances. This can, for example, occur where the LiDAR point sampling is too sparse to capture the nonplanar objects on which the image tie-points are located. Thus, the hybrid voxel variance has to be inflated in such cases.
4. Experimental Results
An experiment was conducted to test the method for multisensor matching from LP images, frame images, and LiDAR point clouds. The experiment was based on a data set acquired from an airplane over the Norwegian University of Life Sciences in Ås, Norway (59.665°N, 10.775°E), in October 2022. The data set originally consisted of 24 flight strips flown at two altitudes, as presented in [14]. However, a reduced experiment with four flight strips was conducted here to imitate a more common operational photogrammetric block, where the flight time should be minimal while still providing enough observations for estimating the sensor and trajectory parameters. The planned data acquisition variables and the sensors used are shown in Table 1. Unlike the experiment in [14], we introduced both LiDAR data and frame images to the joint adjustment with the LP image lines. All data were captured on the same flight with a dual-hatch airplane, where the LiDAR scanner and frame camera were installed with an INS in the fore hatch. The LP camera was installed in the aft hatch with a separate INS. The flight efficiency was mainly limited by the narrow field of view (FoV) and minimum operational exposure time of the LP camera. The camera settings are shown in Table 2. As the LP HSI camera used captures 186 spectral bands, chromatic aberration should be considered. However, this effect is at a sub-pixel level for the HySpex VNIR-1800 camera series used [14,32], and thus only a single spectral band, at a wavelength of 592 nm, was used in the experiment.
The settings of the precisely precalibrated dual-channel LiDAR scanner are shown in Table 3. Common lever-arm corrections (three parameters) and boresight angle corrections (three parameters) were estimated for the two LiDAR channels. Corrections to the boresight angles (three parameters), principal distance (one parameter), radial distortion (three parameters), and tangential distortion (two parameters) were estimated for the frame camera. For the LP camera in the aft hatch, corrections to the boresight angles (three parameters), principal distance (one parameter), radial distortion (two parameters), and tangential distortion (two parameters) were estimated. The distance between the two platform pivot points was constrained to its measured constant value. The trajectory parameters were estimated as corrections to the position (three parameters per time segment) and orientation (three parameters per time segment). Additionally, corrections to each tie-point in object space were estimated. The positional standard deviation (STD) of the processed GNSS/INS solution used as input for the adjustment was ≤1.3 cm in each horizontal dimension and <2.0 cm in height.
The accuracy after adjustment was assessed using 17 white 1.2 m × 1.2 m reference squares as groundtruth data. They were placed on the ground within the survey area (Figure 10). Four of these were used as ground control points (GCPs) for reference and 13 as checkpoints (CPs) for accuracy assessment. The accuracy assessment was conducted by comparing the estimated coordinates of the CP center points to their measured coordinates. The groundtruth centers were precisely measured on the same day as the aerial data acquisition using real-time kinematic (RTK) GNSS positioning with two to three repeated visits. The mean position of the repeated visits was used as the groundtruth coordinate, with an estimated precision of approximately 1 cm. All the reference squares had a baseline of <1 km to the nearest permanent GNSS reference station. The locations of the ground reference data are shown together with the flight strips in Figure 11.
The distribution of the tie-points is shown in Figure 12. These points were used to form either image observations or hybrid observations linking the LiDAR-derived voxel data to the images. Not shown in the figure are the additional voxel observations formed between overlapping voxels.
The LiDAR voxel size was set to 3 m for both experiments to ensure a sufficient statistical representation of the points in a voxel. The weight of the LiDAR voxel observations is given by the covariance of the voxel, whereas the a priori precision of the image observations was set to 1/3 pixel for the frame camera and 1/2 pixel for the LP camera.
Trajectory segmentation time steps of 1, 2, 5, 10, 20, and 30 s were tested for estimating the trajectory corrections (Figure 13). The time step was evaluated by analyzing the RMSE of the height component of the CP errors for the LiDAR scanner.
Table 4 shows the initial CP error statistics of the LP camera, frame camera, and LiDAR scanner before adjustment. An initial aerotriangulation based on the a priori sensor states was performed from the images to produce these error statistics. The precisely precalibrated LiDAR scanner resulted in cm-accurate point clouds. The median STD of the LiDAR points used to compute the initial error statistics was 3.0 cm, which describes the precision of the LiDAR measurements on the CPs.
Table 5 shows the CP error statistics of the LP camera, frame camera, and LiDAR scanner for the experiment with a trajectory segmentation time step of 10 s. Hybrid observations were not formed for these CPs; they were, however, formed for the other tie-points located within voxels classified as planar surfaces. The error statistics are given as the minimum, maximum, median, interquartile range (IQR), and root mean square error (RMSE). Under the assumption of normally distributed data, the IQR corresponds to approximately 1.35 STDs, but unlike the STD, it is robust to outliers. Only the height error is shown for the LiDAR scanner, as the distance along the normal vector of the reference squares was used as the error metric between the groundtruth data and the LiDAR voxels. The median STD of the LiDAR points was 1.1 cm.
5. Discussion
The presented method is an extension of earlier work on the hybrid adjustment of images and LiDAR point clouds [13] and the bundle adjustment of LP HSI [14]. Here, we have presented a method showing how observations from a LiDAR scanner, a frame camera, and an LP camera can be used together in a joint hybrid adjustment. This is, to our knowledge, the first published work demonstrating the combination of data from these three sensor types to jointly estimate the trajectory and sensor corrections using a common trajectory correction model. Under the assumption that the observations from the different sensors are weighted correctly, the increased number of observations stemming from all these sensors conceptually ensures a better-determined system than using the sensor measurements in separate adjustments. The limited number of unknown parameters resulting from the trajectory cubic spline model and the voxelization of the LiDAR point cloud ensures scalability. Additionally, the joint adjustment of data from multiple sensors gives the optimal matching solution for all the used data.
5.1. Experiment Design and Accuracy
In the experiment, the frame camera was installed in the fore hatch of the airplane together with the LiDAR scanner (Table 1). Thus, observations from these two sensors contribute to estimating the same trajectory parameters using the same modeling of the trajectory corrections. However, the experiment involved two platforms that could rotate independently, and the two platforms thus had separate trajectories with unknown orientation errors. Even though a distance constraint was introduced between the platforms, both trajectories had to be corrected. Thus, only part of the potential of the hybrid adjustment of data from the three sensor types was utilized in the presented experiment. Only the distance was constrained between the platforms, as the angular readings of the gyro mounts are one order of magnitude less accurate than the orientations from the two IMUs. This distance can be measured with mm to cm precision, unlike a constraint on the relative 3D positions between the platforms; the latter would require a transformation through the vehicle body frame and the additional consideration of the gyro-mount precision. Thus, unlike the distance between the platforms, neither the relative orientation nor the relative 3D position of the two platforms can feasibly be constrained using the gyro-mount information.
The tie-points from the frame camera and those used to generate hybrid observations were evenly distributed over radiometrically inhomogeneous areas in object space (Figure 12). The LP camera had a very small lateral overlap (Table 2), which essentially led to tie-points being formed only in the areas where the three parallel flight strips overlapped the crossing flight strip.
The frame camera parameters are well estimated and provide accurate results for this camera. The LiDAR scanner and frame camera combined provide accurate measurements in all three dimensions, and the platform distance constraint helps to estimate the time-dependent positional error for the LP camera.
The results show that the planimetric error is at the sub-pixel level, whereas the poor base-to-height ratio makes the CP height error large when measured from the LP camera. The adjustment of the frame images is very accurate (Table 5). Thus, the effect of including the hybrid observations between tie-points from the frame images and the LiDAR point cloud is relatively limited, since the photogrammetric block is very strong, with significant overlap and a good base-to-height ratio (Table 2). The experiment was conducted with four GCPs; however, the joint adjustment effectively minimized the residuals between the observations derived from the different sensors. Thus, the GCPs are only needed to ensure good absolute accuracy, while the relative consistency between the image and LiDAR data is unaffected by the GCPs. The reduced requirement for dedicated LiDAR ground-control data has been pointed out as a benefit of joint hybrid adjustment in previous studies, e.g., [3,33], and is a major advantage compared to processing the data in separate per-sensor LiDAR strip adjustments and image bundle adjustments.
There are clear theoretical advantages to combining sensor data from all modalities at the observation level. The current experiment does not allow for an in-depth analysis of precision and accuracy under different conditions and serves only to verify the practical applicability of the suggested approach.
5.2. Observation Retrieval
The observations to be used in the adjustment must be retrieved from the images and the LiDAR point cloud to conduct the joint adjustment. The hybrid observations between the cameras and LiDAR scanner are important to ensure consistent geometric matching between the data from the different sensors.
One main advantage of conducting the joint hybrid adjustment of LiDAR and image data is the ability to estimate temporal trajectory corrections in geometrically or radiometrically homogeneous areas in object space, as long as distinct features exist in one of these domains; the LiDAR scanner will provide valuable observations for the adjustment based on geometric features, whereas the cameras will provide observations based on radiometric saliency. For airborne platforms, such as in this experiment, the LiDAR scanning results in an abundant number of observations effectively representing the ground level, whereas the cameras provide observations effectively describing the planimetric discrepancies.
The hybrid observations from airborne platforms, such as in this experiment, often only describe the height errors. This is mainly owing to the limited number of hybrid observations formed on non-horizontal planes from the aerial data acquisition. As the residual is minimized along the normal vector of planes detected in LiDAR voxels, the observability of parameters explaining the planimetric errors in the LiDAR point cloud is limited. This is also true for the voxel observations used to match LiDAR point clouds from different flight lines to each other. As only very few vertical surfaces are observed by the airborne LiDAR scanner, only the inclined surfaces (e.g., inclined roofs) and indistinct voxels contribute to minimizing the planimetric error. Even though the observations formed from indistinct voxels contribute to estimating the parameters that minimize the planimetric error, these observations have a relatively low weighting compared to the planar surfaces, as expressed by their respective covariance matrices. The exact voxel size has earlier been shown not to be critical for the method, as long as it is reasonably chosen to include planar surfaces [13]. The 0.3 cm vertical LiDAR RMSE with a segmentation time step of 10 s shown in Figure 13 indicates that the ground level was geometrically well-defined from the LiDAR measurements. However, vertical structures are difficult to detect from airborne LiDAR scanning, which limits both the observability of parameters connected to the horizontal displacement and the possibility of assessing the horizontal LiDAR point-cloud accuracy. All in all, the photogrammetric block ensures sub-pixel planimetric accuracy, and the combined LiDAR and photogrammetric adjustment compensates for the weaknesses of each data source to increase the accuracy in all three dimensions.
It is challenging to accurately form observations between RGB frame images and LP HSI. LP cameras are often used for HSI, which typically has entirely different camera properties than standard RGB frame imaging (e.g., spectral bandwidth, spectral response, and chromatic aberration). Thus, an analysis of the spectral bands of the HSI camera to imitate the spectral signal recorded by the RGB camera would be necessary if salient key points in images from the two different camera types were to be confidently matched. In contrast, the approach proposed in the experiment is robust to the different camera types (i.e., HSI and RGB cameras), as the image observations are retrieved separately in different preprocessing steps for the two cameras (Figure 3).
Some hybrid observations are outliers, as with any other observations from real-world data. Tie-points from images are located in image neighborhoods with distinct radiometry, which are often found on geometrically sharp structures relative to the surrounding planar surface. An example from the presented experiment is shown in Figure 14. The LiDAR point cloud is not always dense enough, nor is the footprint small enough, to precisely measure and express such distinct geometric structures through the voxel covariance matrix. Thus, the hybrid observation variance should in these cases be inflated to prevent over-optimistic observation weighting.
Hybrid observations between the tie-points and LiDAR voxels help to estimate the image depth at the locations where the tie-points are observed in the images. However, the accurate estimation of parameters describing the image depth, mainly the principal distance, still depends on the viewing geometry of the image rays observing a tie-point in object space. Furthermore, for an airborne platform, the limited LiDAR view angle limits the ability to correct horizontal errors with LiDAR scanners; consequently, very few tie-points located on vertical structures are used to form hybrid observations. As the tie-point positions in object space are initially estimated from the image observations (Figure 3), they need to be observed in at least two images from cameras with similar radiometric properties. This further limits the number of hybrid observations, as some distinct radiometric image measurements may be visible in only a single image. However, the alternative, ray-casting on the LiDAR point cloud to compute the initial tie-point coordinates, is a computationally expensive operation.
5.3. Trajectory Segmentation
The trajectory errors are known to change within a flight strip to a degree that degrades the resulting LiDAR scans, e.g., [22]. Hence, a higher-order trajectory correction model is appropriate. For cameras, several images are often taken per second; thus, a trajectory parameterization per image exposure is often an over-parameterization. Additionally, several cameras are commonly installed on the same platform (e.g., in oblique imaging). In such cases, the method proposed here takes advantage of all the observations from all the used sensors to estimate the same trajectory corrections.
Figure 13 shows that the choice of the trajectory segmentation time step is not critical for achieving accurate results. However, shorter trajectory segmentation time steps generally offer higher accuracy at the cost of increased model complexity.
Normally, the number of retrieved observations is extremely high compared to the number of required parameters when using a trajectory segmentation time step of a few seconds; thus, the method is robust to the choice of this time step. Using a similar approach, a trajectory segmentation time step of ≤10 s has been shown to provide accurate results for UAVs; see [22].
Certain sensor parameters, such as the boresight correction angles of the LP camera, are challenging to estimate correctly without a rigorous trajectory error model that accounts for the temporal changes in the trajectory errors. When using a trajectory time segmentation as presented, any sensor used for kinematic mapping can strengthen the trajectory error estimation as long as its functional model is known, its observations are reliable, and precise time stamps are provided for the data.
7. Conclusions
A novel theoretical development was presented, and an experiment was conducted to show that a scalable joint hybrid adjustment of data from LiDAR scanners, frame cameras, and LP cameras is achievable. To the authors' knowledge, this is the first time the joint matching of observations from these three modalities has been demonstrated. In such an adjustment, the observations from the three sensors contribute to estimating the same corrections to the trajectory and sensor interior orientations. This allows the observations from the different sensors to be used directly in the joint adjustment, rather than adjusting the data in subsequent processes. Tie-points can be constrained to lie on planar surfaces derived from the LiDAR data, leading to increased height accuracy. The planimetric accuracies were an RMSE of 1/7 of the ground sampling distance (GSD) for the frame images and 1/2 of the GSD for the LP images in each of the two planimetric dimensions. In addition to the sensors used here, the general trajectory error formulation allows observations from any sensor with a known functional model, reliable observations, and precise time stamps to be included in the joint adjustment of data from kinematic mapping platforms.