Article

Comparison of Computer Vision and Photogrammetric Approaches for Epipolar Resampling of Image Sequence

Department of Geoinformatic Engineering, Inha University, 100 Inharo, Nam-gu, Incheon 22212, Korea
* Author to whom correspondence should be addressed.
Sensors 2016, 16(3), 412; https://doi.org/10.3390/s16030412
Submission received: 13 February 2016 / Revised: 15 March 2016 / Accepted: 16 March 2016 / Published: 22 March 2016
(This article belongs to the Section Remote Sensors)

Abstract

Epipolar resampling is the procedure of eliminating vertical disparity between stereo images. Due to its importance, many methods have been developed in the computer vision and photogrammetry fields. However, we argue that epipolar resampling of image sequences, rather than of a single pair, has not been studied thoroughly. In this paper, we compare epipolar resampling methods developed in both fields for handling image sequences. We first briefly review the uncalibrated and calibrated epipolar resampling methods developed in computer vision and the photogrammetric epipolar resampling methods. While it is well known that the methods developed in the two fields are mathematically identical, we also point out differences in parameter estimation between them. We then test representative resampling methods from both fields and analyze the results. We show that for epipolar resampling of a single image pair all tested uncalibrated and photogrammetric methods can be used. More importantly, we also show that for image sequences all tested methods except the photogrammetric Bayesian method exhibit significant variations in epipolar resampling performance. Our results indicate that the Bayesian method is favorable for epipolar resampling of image sequences.

1. Introduction

Epipolar resampling is the procedure of eliminating Y parallax, or vertical disparity, between a stereo image pair. This procedure is critical both for stereo image processing and for three-dimensional (3D) content generation. For stereo image processing, it converts two-dimensional (2D) correspondence search problems into one-dimensional (1D) ones and hence enhances automated depth map generation [1,2,3]. For 3D content generation, it eliminates visual fatigue and produces high-quality 3D perception [4]. It can also enhance the processing of various stereovision systems such as mobile robots or smart vehicles.
Epipolar resampling has been studied extensively in the fields of computer vision and photogrammetry. In computer vision, epipolar resampling is achieved by a homographic transformation that sends the epipoles of the original images to infinity [5,6,7,8,9,10,11]. Depending on how the homography is estimated, epipolar resampling methods can be classified into two approaches: uncalibrated and calibrated. The uncalibrated approach estimates the homography from the fundamental matrix, which is determined using tie-points between a stereo image pair. The calibrated approach estimates the homography from known intrinsic and extrinsic parameters of the stereo images. Typically, these parameters are acquired by a stereo calibration method using calibration patterns [12,13].
In photogrammetry, epipolar resampling is performed by a perspective transformation that aligns epipolar lines with horizontal image lines. This transformation is defined by the relative orientation parameters between the two images, which are estimated mathematically from collinearity or coplanarity equations [14,15,16,17]. In a typical photogrammetric application, the intrinsic parameters are assumed to be known, and the relative orientation parameters, i.e., the extrinsic parameters of the right image with respect to the left image frame, are estimated from tie-points.
Epipolar resampling developed in each field serves its own purposes and applications. In computer vision, for example, epipolar resampling is used for robot vision and fast image processing [18,19,20]. In photogrammetry, it is mainly used for precise image rectification for stereo plotters [14,21,22]. To serve new and challenging applications, there is an increasing need to merge the techniques developed in computer vision and photogrammetry. However, an evaluation of the epipolar resampling methods from each field on a common dataset, a comparison of their performance, and an understanding of their theoretical and practical differences have been missing.
In this paper, we aim to compare epipolar resampling methods developed in computer vision and photogrammetry. In particular, we apply epipolar resampling not to a single stereo pair but to a sequence of stereo images. We argue that most previous investigations addressed epipolar resampling of a single image pair and that epipolar resampling of image sequences has not been studied thoroughly. Here we focus on epipolar resampling of image sequences with the intention of supporting stereo applications where two cameras are installed on separate platforms and move independently.
We will first review the general principles and formulations of epipolar resampling developed in the computer vision field. We will then review the epipolar resampling developed in photogrammetry. We will re-confirm the well-known principle that epipolar resampling methods developed in computer vision and in photogrammetry are mathematically identical. We will also point out practical differences between them. We will then explain image sequences and the representative epipolar resampling methods used for tests. Finally, we report and compare their performance over a single image pair and over image sequences.

2. Epipolar Resampling in Computer Vision

Epipolar geometry between a stereo pair is explained in Figure 1. The two perspective centers ($C_1$ and $C_2$ in the figure) and a ground point $P$ define an epipolar plane. Epipolar lines $l_1$ and $l_2$ are the intersections of the epipolar plane with the left and right image planes, respectively. Epipoles $e_1$ and $e_2$ are the intersections of the line connecting the two perspective centers with the left and right image planes.
For perspective images, any pair of corresponding left and right image points $q_1$ and $q_2$ satisfies the following matrix equation [23]:

$$q_2^T F q_1 = 0 \quad (1)$$

where $q_1 = (u_1, v_1, 1)^T$ and $q_2 = (u_2, v_2, 1)^T$ are the homogeneous image coordinates of the left and right image points ($p_1$ and $p_2$) and $F$ is the fundamental matrix. The epipolar lines $l_1$ and $l_2$ and the epipoles $e_1$ and $e_2$ can be recovered from $F$:

$$l_1 = F^T q_2, \quad l_2 = F q_1, \quad F e_1 = 0, \quad F^T e_2 = 0 \quad (2)$$
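A minimal sketch (ours, not from the original paper), assuming a rank-2 fundamental matrix $F$ held in a NumPy array, evaluates Equation (2) directly; the epipoles are recovered as the null vectors of $F$ via SVD:

```python
import numpy as np

def epipolar_entities(F, q1, q2):
    """Epipolar lines and epipoles from F, following Equation (2).

    F      : 3x3 fundamental matrix (rank 2)
    q1, q2 : homogeneous points (u, v, 1) in the left/right image
    """
    l2 = F @ q1        # epipolar line of q1 in the right image
    l1 = F.T @ q2      # epipolar line of q2 in the left image

    # The epipoles are the right and left null vectors of F.
    U, _, Vt = np.linalg.svd(F)
    e1 = Vt[-1]        # F e1 = 0
    e2 = U[:, -1]      # F^T e2 = 0
    return l1, l2, e1 / e1[2], e2 / e2[2]
```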
Epipolar resampling can be achieved by a homographic transformation mapping the epipole of the original image to a point at infinity. In the uncalibrated approach, the homography can be determined from the fundamental matrix as

$$H = G A T \quad (3)$$

where $T$ is a translation taking the principal point $(c_x, c_y)$ of the image to the origin of the coordinate system, $A$ is a rotation about the origin taking the epipole to a point $(k, 0, 1)^T$ on the x-axis, and $G$ is a transformation taking the moved epipole to the point at infinity $(k, 0, 0)^T$, as shown below [8]:

$$T = \begin{bmatrix} 1 & 0 & -c_x \\ 0 & 1 & -c_y \\ 0 & 0 & 1 \end{bmatrix}, \quad G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1/k & 0 & 1 \end{bmatrix} \quad (4)$$
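The sketch below (an illustrative implementation, not the authors' code) assembles $H = GAT$ from an epipole and the principal point, following the standard forms in [8]:

```python
import numpy as np

def rectifying_homography(e, cx, cy):
    """Build H = G A T (Equations (3) and (4)) for one image.

    e        : epipole in homogeneous image coordinates
    (cx, cy) : principal point of the image
    """
    # T: move the principal point to the origin.
    T = np.array([[1.0, 0.0, -cx],
                  [0.0, 1.0, -cy],
                  [0.0, 0.0, 1.0]])
    et = T @ e
    et = et / et[2]
    # A: rotate the translated epipole onto the positive x-axis.
    theta = -np.arctan2(et[1], et[0])
    A = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                  [np.sin(theta),  np.cos(theta), 0.0],
                  [0.0, 0.0, 1.0]])
    k = (A @ et)[0]
    # G: send the epipole (k, 0, 1)^T to infinity (k, 0, 0)^T.
    G = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0],
                  [-1.0 / k, 0.0, 1.0]])
    return G @ A @ T
```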
Once the homographies are calculated, an optimization is generally carried out to align corresponding epipolar lines between the stereo images [7,9,24].
Under the uncalibrated approach, the fundamental matrix can be estimated from tie-points without any additional information. This approach therefore enables automation of the whole process and the use of images from unknown sources. Many studies have addressed the reliable estimation of the fundamental matrix and the reliable reconstruction of epipolar geometry. These include Hartley's normalized eight-point algorithm [25] and algorithms that apply additional constraints such as algebraic minimization [26], minimization of epipolar distance [8,24] and other geometric cost functions [27,28,29].
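As an illustrative sketch (our code, under the assumption that matched tie-points are already available), OpenCV exposes both the normalized eight-point solution and a RANSAC variant through `cv2.findFundamentalMat`:

```python
import cv2
import numpy as np

def estimate_fundamental(pts1: np.ndarray, pts2: np.ndarray):
    """Estimate F from matched tie-points (Nx2 float arrays, N >= 8).

    cv2.FM_RANSAC combines a normalized eight-point solution with
    RANSAC outlier rejection at a 1-pixel epipolar tolerance; use
    cv2.FM_8POINT for the plain normalized eight-point algorithm [25].
    """
    F, mask = cv2.findFundamentalMat(
        pts1, pts2, cv2.FM_RANSAC,
        ransacReprojThreshold=1.0, confidence=0.99)
    return F, mask.ravel().astype(bool)
```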
The uncalibrated approach may, however, be sensitive to tie-point noise and prone to image distortion because the transformation is not based on the physical geometry between the two images. Fusiello and Irsara [30] tried to overcome these problems with a new uncalibrated approach called quasi-Euclidean epipolar rectification, which estimates the stereo geometry from non-linear equations by minimizing geometric re-projection errors.
In the calibrated approach, the intrinsic and extrinsic parameters of the left and right images are estimated separately by stereo calibration using dedicated calibration patterns [12,13,31,32]. The extrinsic parameters are used to estimate the relative geometric relationship between the two images, which in turn defines the homography for epipolar resampling.
Assuming $R_1$ and $R_2$ are the $3 \times 3$ rotation matrices containing the rotational extrinsic parameters of the left and right cameras, and $C_1$ and $C_2$ are the $3 \times 1$ translation vectors containing the extrinsic parameters locating the perspective centers, the relative rotation $R$ and translation $B$ between the two images are defined as below [33]:

$$R = R_2 R_1^T = \begin{bmatrix} r_{11} & r_{12} & r_{13} \\ r_{21} & r_{22} & r_{23} \\ r_{31} & r_{32} & r_{33} \end{bmatrix}, \quad B = R_1 (C_2 - C_1) = \begin{bmatrix} b_x \\ b_y \\ b_z \end{bmatrix} \quad (5)$$

The homographies, $H_1$ for the left and $H_2$ for the right image, can be estimated from two rotations. The first rotations, $R_{\text{half}1}$ and $R_{\text{half}2}$, split the relative rotation $R$ between the two images so that their overlap is maximized. The second rotation, $R_{\text{rect}}$, transforms the epipole to infinity using the baseline vector $B$:

$$H_1 = R_{\text{rect}} R_{\text{half}1}, \quad H_2 = R_{\text{rect}} R_{\text{half}2} \quad (6)$$

where $R = R_{\text{half}1} (R_{\text{half}2})^T$, $R_{\text{rect}} = \begin{bmatrix} m_1^T \\ m_2^T \\ m_3^T \end{bmatrix}$ and $m_1$, $m_2$ and $m_3$ are defined as

$$m_1 = \frac{B}{\|B\|}, \quad m_2 = \frac{[-b_y \;\; b_x \;\; 0]^T}{\sqrt{b_x^2 + b_y^2}}, \quad m_3 = m_1 \times m_2 \quad (7)$$
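A compact sketch of Equation (7) (illustrative; the function name is ours) builds $R_{\text{rect}}$ from the baseline vector:

```python
import numpy as np

def rectifying_rotation(B):
    """R_rect from the baseline vector B = (bx, by, bz), Equation (7).

    Row 1 points along the baseline, row 2 is orthogonal to it within
    the image plane, and row 3 completes a right-handed frame.
    """
    B = np.asarray(B, dtype=float)
    bx, by, _ = B
    m1 = B / np.linalg.norm(B)
    m2 = np.array([-by, bx, 0.0]) / np.hypot(bx, by)
    m3 = np.cross(m1, m2)
    return np.vstack([m1, m2, m3])
```

$H_1$ and $H_2$ then follow from Equation (6) by composing this matrix with $R_{\text{half}1}$ and $R_{\text{half}2}$.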
As mentioned above, while the calibrated approach guarantees minimal image distortion and high accuracy, it requires a prior calibration process. Accordingly, this approach is limited in terms of availability.

3. Photogrammetric Epipolar Resampling

In this section, we review epipolar resampling as developed in photogrammetry. While its mathematical formulation is well reported [17], we re-formulate the photogrammetric camera models here in order to compare them more directly with those developed in computer vision. In photogrammetry, the geometric relationship between a ground point $P$ and its left and right image points, $q_1$ and $q_2$, is described by the collinearity equation below:

$$q_1 = M_1 p_1 = \lambda_1 M_1 R_1 P_1, \quad q_2 = M_2 p_2 = \lambda_2 M_2 R_2 P_2 \quad (8)$$

where $\lambda$ is a scale factor, $R_1$ and $R_2$ are the $3 \times 3$ rotation matrices defined as before, $P_1 = \overrightarrow{C_1 P}$ and $P_2 = \overrightarrow{C_2 P}$, and $p_1 = (x_1, y_1, f_1)^T$ and $p_2 = (x_2, y_2, f_2)^T$ are the look vectors in the left and right camera frames from the perspective center to the projected point. $M_1$ and $M_2$ are the camera matrices of the left and right images, which convert camera coordinates to image coordinates as follows:

$$M_1 = \begin{bmatrix} f_1 & 0 & c_{x1} \\ 0 & f_1 & c_{y1} \\ 0 & 0 & 1 \end{bmatrix}, \quad M_2 = \begin{bmatrix} f_2 & 0 & c_{x2} \\ 0 & f_2 & c_{y2} \\ 0 & 0 & 1 \end{bmatrix} \quad (9)$$

where $c_x$ and $c_y$ are defined as in Equation (4) and $f$ is the focal length of the camera.
The geometric relationship between the stereo images can be described by the condition that the baseline vector $S = \overrightarrow{C_1 C_2}$, the left look vector $P_1$ and the right look vector $P_2$ are coplanar. This coplanarity condition can be expressed as the scalar triple product of the three vectors:

$$P_2^T \cdot [S \times P_1] = P_2^T [S]_\times P_1 = 0, \quad [S]_\times = \begin{bmatrix} 0 & -s_z & s_y \\ s_z & 0 & -s_x \\ -s_y & s_x & 0 \end{bmatrix} \quad (10)$$

Taking the left camera axes as the reference frame, i.e., $R_1 = I_{3 \times 3}$ and $C_1 = 0_{3 \times 1}$, the coplanarity equation above can be re-written in the form of the fundamental matrix:

$$P_2^T [S]_\times P_1 = p_2^T R [B]_\times p_1 = 0 \quad (11)$$

$$q_2^T (M_2^{-1})^T R [B]_\times M_1^{-1} q_1 = q_2^T F q_1 = 0 \quad (12)$$

where $R$ and $B$ are the relative rotation and translation of the right camera with respect to the left image frame [17]. As Equation (12) shows, the coplanarity equation used in photogrammetry is mathematically identical to the fundamental matrix equation used in computer vision. The main difference between the uncalibrated and photogrammetric approaches is that the former estimates the full eight parameters of the fundamental matrix from tie-points, whereas the latter estimates only the relative rotational and translational parameters.
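Equation (12) translates directly into code; the sketch below (our illustration) assembles $F$ from the camera matrices and the relative orientation:

```python
import numpy as np

def skew(v):
    """[v]_x, the skew-symmetric matrix with [v]_x @ p = v x p."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_orientation(M1, M2, R, B):
    """F = (M2^{-1})^T R [B]_x M1^{-1}, i.e., Equation (12)."""
    return np.linalg.inv(M2).T @ R @ skew(B) @ np.linalg.inv(M1)
```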
Epipolar resampling is carried out by two perspective transformations derived from the extrinsic parameters [17,34]. First, the right image plane is re-projected parallel to the reference frame (the left camera axes) using the rotation angles of the right camera, as shown in Figure 2a. Second, the two image planes are re-projected to align with the baseline connecting the two perspective centers and to remove Y parallaxes, as shown in Figure 2b.
These two perspective transformations can also be expressed as homographies:

$$H_1 = R_{\text{base}} R_1^T, \quad H_2 = R_{\text{base}} R_2^T \quad (13)$$

where $R_{\text{base}}$ is the second perspective transformation, which aligns the images with respect to the baseline and is defined from the rotation angles of the baseline about the X, Y and Z axes as

$$R_{\text{base}} = R_\Omega R_\Phi R_K = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\Omega & -\sin\Omega \\ 0 & \sin\Omega & \cos\Omega \end{bmatrix} \begin{bmatrix} \cos\Phi & 0 & \sin\Phi \\ 0 & 1 & 0 \\ -\sin\Phi & 0 & \cos\Phi \end{bmatrix} \begin{bmatrix} \cos K & -\sin K & 0 \\ \sin K & \cos K & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad (14)$$

$$\Omega = \frac{\omega_1 + \omega_2}{2}, \quad \Phi = \tan^{-1}\frac{b_z}{\sqrt{b_x^2 + b_y^2}}, \quad K = \tan^{-1}\frac{b_y}{b_x} \quad (15)$$
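As an illustrative sketch of Equations (14) and (15) (the function name is ours, and the reconstruction of $K$ as $\tan^{-1}(b_y/b_x)$ assumes the usual convention of first removing the baseline's y-component):

```python
import numpy as np

def baseline_rotation(omega1, omega2, B):
    """R_base from Equations (14) and (15); angles in radians."""
    bx, by, bz = B
    Omega = 0.5 * (omega1 + omega2)
    Phi = np.arctan2(bz, np.hypot(bx, by))
    Kappa = np.arctan2(by, bx)

    R_Omega = np.array([[1, 0, 0],
                        [0, np.cos(Omega), -np.sin(Omega)],
                        [0, np.sin(Omega),  np.cos(Omega)]])
    R_Phi = np.array([[np.cos(Phi), 0, np.sin(Phi)],
                      [0, 1, 0],
                      [-np.sin(Phi), 0, np.cos(Phi)]])
    R_Kappa = np.array([[np.cos(Kappa), -np.sin(Kappa), 0],
                        [np.sin(Kappa),  np.cos(Kappa), 0],
                        [0, 0, 1]])
    return R_Omega @ R_Phi @ R_Kappa
```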
As shown in Equations (13)–(15), photogrammetric epipolar resampling is mathematically identical to that of the calibrated approach. The main difference between the calibrated and photogrammetric approaches is that the former computes the relative orientation parameters from the left and right extrinsic parameters, which are estimated with calibration patterns, whereas the latter estimates the relative orientation parameters directly from tie-points.
The photogrammetric approach may also be sensitive to tie-point noise. To overcome this sensitivity in epipolar resampling, one may apply the Bayesian approach, a popular estimation technique that uses a priori and a posteriori error statistics of the unknown parameters [35]. In photogrammetric epipolar resampling, the convergence range of the extrinsic parameters can be limited by setting a priori error statistics of the parameters, and the effect of tie-point errors on the overall estimation can be reduced by setting a priori error statistics of the tie-point measurements. The advantage of the Bayesian approach is that it can constrain both the observation equations and the estimated parameters, with the constraints contributing to the determination of the unknowns as weights. If the a priori constraints are derived from the actual geometric characteristics of the camera arrangement, this approach should behave more consistently over stereo image sequences and more robustly against tie-point errors. For many photogrammetric applications, the constraints can be determined from sensor specifications or operational practice [36,37].
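A minimal sketch of one such constrained least-squares iteration (our illustration; the design matrix `A` and misclosure vector `w` would come from linearizing the coplanarity equations, which we omit here):

```python
import numpy as np

def bayesian_step(A, w, sigma_obs, x, sigma_x):
    """One Gauss-Markov iteration with a priori parameter constraints.

    A         : n x m design matrix (partial derivatives of the
                coplanarity equations w.r.t. orientation parameters)
    w         : n-vector of misclosures at the current estimate x
    sigma_obs : a priori std. dev. of tie-point measurements [pixel]
    x         : current estimate, initialized at the a priori values
                (so the prior residual is zero at the start)
    sigma_x   : a priori std. dev. of each parameter; small values
                keep that parameter close to its a priori value
    """
    P = np.eye(len(w)) / sigma_obs ** 2            # observation weights
    Px = np.diag(1.0 / np.asarray(sigma_x) ** 2)   # parameter weights
    # Constrained normal equations: (A^T P A + Px) dx = A^T P w.
    # Px acts as a weight limiting the convergence range of x.
    dx = np.linalg.solve(A.T @ P @ A + Px, A.T @ P @ w)
    return x + dx
```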

4. Datasets and Methodology

For comparative analysis of the epipolar resampling methods, four stereo image sequences (TEST01, TEST02, TEST03 and TEST04) were acquired with two webcams (Microsoft Webcam Cinema). The four sequences were designed so that the error factors increase from TEST01 to TEST04. Two sequences (TEST01 and TEST02) were acquired in a low-resolution mode with an image size of 320 × 240 pixels and two (TEST03 and TEST04) in a high-resolution mode with 640 × 480 pixels. Each sequence was acquired with a different baseline distance and camera tilt angle (see Table 1) in order to check the effect of a wider baseline and a larger tilt angle on epipolar resampling performance. For each sequence, 100 stereo pairs were acquired without camera movement in order to analyze the consistency of geometry estimation with respect to the change of tie-points among different pairs.
For estimation of epipolar geometry, tie-points were extracted using the scale-invariant feature transform (SIFT) algorithm [38] and an outlier removal algorithm [39] based on random sample consensus (RANSAC). Tolerances for outlier removal were set to one pixel for TEST01 and three pixels for the other datasets. The one-pixel tolerance represents the optimal case, in which tie-points have little position error; the three-pixel tolerance represents the practical case, since this value is the most widely used in practice. This setup aims to analyze robustness against tie-point errors. Additional tie-points were extracted manually at 10 locations in each image sequence and used to check the accuracy of epipolar resampling. Figure 3 shows one stereo pair from each of the four image sequences and the 10 manually extracted tie-points.
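An illustrative version of this tie-point pipeline using OpenCV (a sketch, not the authors' implementation; the 0.75 ratio-test threshold is our assumption, while the tolerance values are those quoted above):

```python
import cv2
import numpy as np

def extract_tie_points(img_left, img_right, tolerance_px=3.0):
    """SIFT tie-points with RANSAC outlier removal."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(img_left, None)
    k2, d2 = sift.detectAndCompute(img_right, None)

    # Lowe's ratio test on the two nearest descriptor distances.
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    matches = [m for m, n in matcher.knnMatch(d1, d2, k=2)
               if m.distance < 0.75 * n.distance]

    pts1 = np.float32([k1[m.queryIdx].pt for m in matches])
    pts2 = np.float32([k2[m.trainIdx].pt for m in matches])

    # RANSAC against the epipolar geometry removes mismatches; the
    # tolerance is 1 px for TEST01 and 3 px for the other datasets.
    _, mask = cv2.findFundamentalMat(pts1, pts2, cv2.FM_RANSAC,
                                     ransacReprojThreshold=tolerance_px)
    inliers = mask.ravel() == 1
    return pts1[inliers], pts2[inliers]
```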
In the experiments, five existing epipolar resampling methods were tested: three developed in computer vision and two in photogrammetry. Table 1 lists the methods used for the experiments; they were chosen as representative of the different approaches described in Section 2 and Section 3. “Bouguet” denotes the calibrated approach based on the homographies in Equation (6), developed by Bouguet [40]. “Hartley” is the normalized eight-point algorithm with geometric error minimization based on the Sampson cost function, one of the best-known uncalibrated methods in the computer vision field [8]. “Fusiello” is the quasi-Euclidean epipolar rectification developed by Fusiello and Irsara [30]. These two methods belong to the uncalibrated approach. “Kim” is a photogrammetric epipolar resampling method based on the relative orientation explained in Section 3 [16]. “Bayesian” is photogrammetric epipolar resampling with the Bayesian approach for estimating the relative orientation parameters, as explained in Section 3.
For the first three methods we used publicly available software; the fourth and fifth we implemented in-house. The Bouguet method requires a calibration process to determine the intrinsic and extrinsic parameters, for which 10 pairs of chessboard-pattern images were used. The intrinsic parameters acquired from this process were also used in the Kim and Bayesian methods.
As mentioned earlier, the Bayesian method needs a priori constraints for the estimation parameters and the tie-point measurements. These constraints contribute to correct determination of the unknown parameters by limiting their convergence range in the least-squares sense. We set the a priori covariance values for tie-points to one pixel, in consideration of the expected rectification error and the expected accuracy of tie-points obtained from the SIFT and outlier removal algorithms. We set the a priori covariance values for the orientation parameters to the square of 30°, in consideration of the expected variation of the camera angles from the ideal case. We set the a priori covariance values for the baseline vector ($b_y$ and $b_z$) to one, in consideration of the expected deviation of the baseline vector from the ideal case when assuming $b_x = 100$. Different experiment settings would require different covariance values. In our case, we set relatively large angular and relatively small positional covariance values because we installed the two cameras on a stereo rig and rotated them manually.
A total of four performance measures were used to compare the methods quantitatively. Rectification error ($E_r$) and epipolar line fluctuation ($E_v$) were used to evaluate the epipolar geometry; orthogonality ($E_o$) and relative scale change ($E_a$) of the resulting images were used to evaluate image distortion [11]. Rectification error is the residual Y parallax in the resulting image, measured as the difference between the transformed vertical coordinates of tie-points in the left and right images. Epipolar line fluctuation is measured in pixels as the standard deviation of the transformed image coordinates over the whole image sequence. In the ideal case, both should be zero. Orthogonality is measured as the angle between the x and y axes after epipolar resampling, which should ideally be 90°. Relative scale change is measured as the ratio of the two diagonals after epipolar resampling, which should ideally be one. For more details regarding the performance measures, refer to [11].
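The sketch below illustrates one way to compute these measures (our reading of the definitions in [11]; the exact sampling of axes and diagonals may differ):

```python
import numpy as np

def _warp(H, x, y):
    q = H @ np.array([x, y, 1.0])
    return q[:2] / q[2]

def rectification_error(q1r, q2r):
    """E_r: mean absolute Y parallax of rectified tie-points (Nx2)."""
    return float(np.mean(np.abs(q1r[:, 1] - q2r[:, 1])))

def orthogonality(H, w, h):
    """E_o: angle [deg] between the transformed x and y axes (ideal: 90)."""
    c = _warp(H, w / 2, h / 2)
    x = _warp(H, w, h / 2) - c
    y = _warp(H, w / 2, h) - c
    cosang = x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
    return float(np.degrees(np.arccos(cosang)))

def relative_scale(H, w, h):
    """E_a: ratio of the two transformed diagonals (ideal: 1)."""
    d1 = np.linalg.norm(_warp(H, w, h) - _warp(H, 0, 0))
    d2 = np.linalg.norm(_warp(H, w, 0) - _warp(H, 0, h))
    return float(d1 / d2)

def epipolar_fluctuation(v_per_frame):
    """E_v: std. dev. [pixel] of a tie-point's transformed vertical
    coordinate across all frames of a sequence (ideal: 0)."""
    return float(np.std(v_per_frame))
```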

5. Results of Performance Evaluation

The experiments were carried out first on single pairs and then on image sequences. In both experiments, the results of the Bouguet method were used as ground truth since they were acquired with calibration patterns. Table 1 summarizes the performance comparison of the five methods applied to single pairs. For the methods other than the Bouguet method, we chose 10 consecutive frames from each image sequence, accumulated tie-points from the 10 frames and estimated a single transformation matrix. Note that due to the limit of the matrix dimension, we chose 10 frames instead of all frames. Among the four accuracy measures, epipolar line fluctuation ($E_v$) was excluded because it applies only to image sequences. The actual number of tie-points from the 10 frames used for epipolar resampling is also shown in Table 1. To aid interpretation, the distortion measurements are also reported as differences ($\Delta E_o$ and $\Delta E_a$) from those of the Bouguet method.
For the TEST01 and TEST02 datasets, all methods showed satisfactory rectification errors. These results demonstrate that even when some mismatched tie-points are included, acceptable transformation accuracy can be achieved if the camera configuration is favorable. For the TEST03 and TEST04 datasets, however, rectification errors increased for all tested methods. In particular, a large error increase was observed for the uncalibrated methods, Hartley and Fusiello. This agrees well with previous investigations showing that uncalibrated methods are more prone to tie-point errors under weak stereo geometry [10,11]. The uncalibrated methods also showed larger image distortion than the Bouguet method across all datasets. The two photogrammetric methods, Kim and Bayesian, showed almost identical results in this experiment, which suggests that estimating the relative orientation parameters directly handled tie-point errors better than the uncalibrated methods did. A slight increase in rectification error for the Kim and Bayesian methods was observed with TEST04 due to its wide baseline and large tilt angle.
Table 2 summarizes the performance comparison of the epipolar resampling methods applied to all 100 stereo pairs of the four test datasets. Note that the Bouguet method was not tested on image sequences since it uses an identical homography for all pairs within the same sequence. For this experiment, we did not accumulate tie-points over consecutive images; we treated each frame independently by estimating a transformation matrix, generating an epipolar-resampled pair and calculating the transformation accuracy per frame. Table 2 shows the average number of tie-points used for each image sequence. The accuracy measures in Table 2 are the average (“Mean”) and standard deviation (“Stdev”) over the 100 frames.
Firstly, we notice that performance on an image sequence differs from that on single pairs. For a single pair, all tested methods showed small differences among themselves, whereas for image sequences their performance differed significantly. All methods showed variations over image sequences in rectification error, image distortion and epipolar line fluctuation. In particular, epipolar line fluctuation was very significant for all but the Bayesian method on all test datasets. These results imply that geometry estimation is inconsistent under the change of tie-points from frame to frame, and that those methods may not be suitable for epipolar resampling of image sequences. Among the methods tested, the Bayesian method showed the smallest changes over image sequences, demonstrating that limiting the convergence ranges of the orientation parameters achieved stable geometry estimation. This property of the Bayesian approach is favorable for image sequences. Figure 4 shows the fluctuation of the rectification errors over all image frames of TEST02 for comparison of consistency.
Secondly, we can see accuracy degradation due to weak stereo geometry, similar to the result of the single-pair experiment. Comparing TEST01 and TEST02 shows that accuracy degraded with larger tie-point errors; nevertheless, the Bayesian method showed the least degradation. Comparing TEST02 and TEST03 shows that the rectification and epipolar line fluctuation errors increased with a larger image size, and hence a smaller pixel size, for an identical camera. Comparing TEST03 and TEST04 shows that the errors also increased with larger misalignment between the two images.
Thirdly, all four errors in Table 2 increased compared to those in Table 1 for all tested methods. This was anticipated because the number of tie-points used for epipolar resampling in Table 2 was smaller than in Table 1. However, the Bayesian method showed the least accuracy degradation, which again favors the Bayesian method when fewer tie-points are available.
In addition, we checked the performance differences of the tested methods by visually comparing the resampled image sequences. Figure 5 shows a portion of the resampled left and right images of TEST02 from each method. Each column shows consecutive resampled image pairs from a particular method; the white horizontal lines represent the epipolar lines of the resampled images. The effects of image distortion and inconsistent epipolar geometry estimation are visible in the result images. Resampled sequences with inconsistent epipolar geometry estimation cannot form a stereoscopic image sequence due to abrupt scene changes. The last column shows that the Bayesian method could handle an image sequence consistently. An explicit comparison of execution time was not carried out because each tested method ran under different conditions. The Hartley and Kim methods were optimized for fast execution (about 32 and 26 frames/s, respectively, on a platform with an Intel Core i5-4460 CPU at 3.20 GHz and 8 GB of memory). The Fusiello and Bayesian methods were not optimized for speed: the Fusiello method was available only as Matlab executables and was very slow (about 2 frames/s), and the Bayesian method used very large matrices without efficient matrix decomposition for parameter estimation and was also slow (about 13 frames/s). Reducing the processing time of the Fusiello or Bayesian methods was, however, deemed outside the scope of this paper and was not attempted.

6. Conclusions

In this paper, epipolar resampling methods developed in computer vision and photogrammetry were analyzed in terms of rectification errors, image distortion and stability of epipolar geometry. The analysis was focused on the resampling of image sequences.
From the theoretical review, we pointed out that while the epipolar resampling methods developed in the two fields are mathematically identical, their parameter estimation may perform differently.
From the analysis of the experimental results, we showed that for epipolar resampling of a single image pair all uncalibrated and photogrammetric methods could be used. For image sequences, however, all tested methods except the Bayesian method showed significant variations in image distortion and epipolar line fluctuation among image pairs within the same sequence. Our results indicate that the Bayesian method is favorable for epipolar resampling of image sequences. They also imply that although differences among the epipolar resampling methods may be unnoticeable when resampling a single pair, they are not when resampling image sequences.
As future work, we will investigate the cause of such variations, ways to minimize them, and how to remove the small variations remaining in the Bayesian method. Optimization of the processing speed of the Bayesian method will also be studied. The rapid change of image sources and the diversification of relevant applications call for merging computer vision and photogrammetric technologies. We hope that our investigation contributes to the understanding of the epipolar resampling technologies developed in both fields and to the development of new stereo applications.

Acknowledgments

This research was supported by the project “Development of the integrated data processing system for GOCI-II” funded by the Ministry of Oceans and Fisheries, Korea.

Author Contributions

Jae-In Kim designed the work, performed the experiments and wrote the manuscript; Taejung Kim supervised the whole process, designed the experiments and wrote the manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Kim, T. A Study on the Epipolarity of Linear Pushbroom Images. Photogramm. Eng. Remote Sens. 2000, 66, 961–966.
  2. Lee, H.; Kim, T.; Park, W.; Lee, H. Extraction of Digital Elevation Models from Satellite Stereo Images through Stereo Matching Based on Epipolarity and Scene Geometry. Image Vision Comput. 2003, 21, 789–796.
  3. Noh, M.; Cho, W.; Bang, K. Highly Dense 3D Surface Generation Using Multi-Image Matching. ETRI J. 2012, 34, 87–97.
  4. Kang, Y.; Ho, Y. An Efficient Image Rectification Method for Parallel Multi-Camera Arrangement. IEEE Trans. Consum. Electron. 2011, 57, 1041–1048.
  5. Al-Shalfan, K.A.; Haigh, J.G.B.; Ipson, S.S. Direct Algorithm for Rectifying Pairs of Uncalibrated Images. Electron. Lett. 2000, 36, 419–420.
  6. Chen, Z.; Wu, C.; Tsui, H.T. A New Image Rectification Algorithm. Pattern Recogn. Lett. 2003, 24, 251–260.
  7. Hartley, R.I. Theory and Practice of Projective Rectification. Int. J. Comput. Vision 1999, 35, 115–127.
  8. Hartley, R.I.; Zisserman, A. Multiple View Geometry in Computer Vision, 2nd ed.; Cambridge University Press: Cambridge, UK, 2003.
  9. Loop, C.; Zhang, Z. Computing Rectifying Homographies for Stereo Vision. In Proceedings of the 1999 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 1999), Fort Collins, CO, USA, 23–25 June 1999; pp. 125–131.
  10. Seo, J.; Hong, H.; Jho, C.; Choi, M. Two Quantitative Measures of Inlier Distributions for Precise Fundamental Matrix Estimation. Pattern Recogn. Lett. 2004, 25, 733–741.
  11. Mallon, J.; Whelan, P.F. Projective Rectification from the Fundamental Matrix. Image Vision Comput. 2005, 23, 643–650.
  12. Zhang, Z. A Flexible New Technique for Camera Calibration. IEEE Trans. Pattern Anal. Mach. Intell. 2000, 22, 1330–1334.
  13. Bok, Y.; Ha, H.; Kweon, I. Automated Checkerboard Detection and Indexing Using Circular Boundaries. Pattern Recogn. Lett. 2016, 71, 66–72.
  14. Schenk, T. Digital Photogrammetry; Terra Science: Laurelville, OH, USA, 1999.
  15. Wolf, P.; DeWitt, B. Elements of Photogrammetry with Applications in GIS, 3rd ed.; McGraw-Hill Science: New York, NY, USA, 2000.
  16. Kim, J.; Kim, H.; Lee, T.; Kim, T. Photogrammetric Approach for Precise Correction of Camera Misalignment for 3D Image Generation. In Proceedings of the 2012 IEEE International Conference on Consumer Electronics (ICCE 2012), Las Vegas, NV, USA, 13–16 January 2012; pp. 396–397.
  17. McGlone, J.C.; Mikhail, E.M.; Bethel, J. (Eds.) Manual of Photogrammetry, 5th ed.; American Society of Photogrammetry and Remote Sensing: Bethesda, MD, USA, 2004.
  18. Lazaros, N.; Sirakoulis, G.C.; Gasteratos, A. Review of Stereo Vision Algorithms: From Software to Hardware. Int. J. Optomech. 2008, 2, 435–462.
  19. Zilly, F.; Kluger, J.; Kauff, P. Production Rules for Stereo Acquisition. Proc. IEEE 2011, 99, 590–606.
  20. Guerra, E.; Munguia, R.; Grau, A. Monocular SLAM for Autonomous Robots with Enhanced Features Initialization. Sensors 2014, 14, 6317–6337.
  21. Adams, L.P. Fourcade: The Centenary of a Stereoscopic Method of Photographic Surveying. Photogramm. Rec. 2001, 17, 225–242.
  22. Wang, M.; Hu, F.; Li, J. Epipolar Resampling of Linear Pushbroom Satellite Imagery by a New Epipolarity Model. ISPRS J. Photogramm. Remote Sens. 2011, 66, 347–355.
  23. Longuet-Higgins, H.C. A Computer Algorithm for Reconstructing a Scene from Two Projections. Nature 1981, 293, 133–135.
  24. Wu, H.; Yu, Y.H. Projective Rectification with Reduced Geometric Distortion for Stereo Vision and Stereoscopic Video. J. Intell. Rob. Syst. Theor. Appl. 2005, 42, 71–94.
  25. Hartley, R.I. In Defense of the Eight-Point Algorithm. IEEE Trans. Pattern Anal. Mach. Intell. 1997, 19, 580–593.
  26. Hartley, R.I. Minimizing Algebraic Error in Geometric Estimation Problems. In Proceedings of the 6th IEEE International Conference on Computer Vision, Bombay, India, 4–7 January 1998; pp. 469–476.
  27. Banno, A.; Ikeuchi, K. Estimation of F-Matrix and Image Rectification by Double Quaternion. Inf. Sci. 2012, 183, 140–150.
  28. Fathy, M.E.; Hussein, A.S.; Tolba, M.F. Fundamental Matrix Estimation: A Study of Error Criteria. Pattern Recogn. Lett. 2011, 32, 383–391.
  29. Kumar, S.; Micheloni, C.; Piciarelli, C.; Foresti, G.L. Stereo Rectification of Uncalibrated and Heterogeneous Images. Pattern Recogn. Lett. 2010, 31, 1445–1452.
  30. Fusiello, A.; Irsara, L. Quasi-Euclidean Epipolar Rectification of Uncalibrated Images. Mach. Vision Appl. 2011, 22, 663–670.
  31. Brown, D.C. Close-Range Camera Calibration. Photogramm. Eng. 1971, 37, 855–866.
  32. Tsai, R.Y. A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses. IEEE J. Robot. Automat. 1987, 3, 323–344.
  33. Medioni, G.; Kang, S. Emerging Topics in Computer Vision; Prentice Hall PTR: Upper Saddle River, NJ, USA, 2004.
  34. Cho, W.; Schenk, T.; Madani, M. Resampling Digital Imagery to Epipolar Geometry. Int. Arch. Photogramm. Remote Sens. 1993, 29, 404–408.
  35. Koch, K. Introduction to Bayesian Statistics, 2nd ed.; Springer: Berlin, Germany, 2007.
  36. Agouris, P.; Schenk, T. Automated Aerotriangulation Using Multiple Image Multipoint Matching. Photogramm. Eng. Remote Sens. 1996, 62, 703–710.
  37. Grodecki, J.; Dial, G. Block Adjustment of High-Resolution Satellite Images Described by Rational Polynomials. Photogramm. Eng. Remote Sens. 2003, 69, 59–68.
  38. Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vision 2004, 60, 91–110.
  39. Fischler, M.A.; Bolles, R.C. Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography. Commun. ACM 1981, 24, 381–395.
  40. Bouguet’s Camera Calibration Toolbox. Available online: http://www.vision.caltech.edu/bouguetj/calib_doc/ (accessed on 12 February 2016).
Figure 1. Epipolar geometry of a stereo image pair.
Figure 2. Procedure of epipolar transformation based on dependent relative orientation. (a) Transformation for reprojecting the right image plane parallel to the reference frame; (b) transformation for reprojecting the two image planes to align with the baseline.
Figure 3. Image datasets used and the 10 tie-points extracted manually for performance evaluation. (a) TEST01, (b) TEST02, (c) TEST03, (d) TEST04.
Figure 4. Rectification errors for each frame of the TEST02 dataset.
Figure 5. A portion of the results from the consecutive rectified images of TEST02. From left, (a) original image and results of the (b) Hartley, (c) Fusiello, (d) Kim, and (e) Bayesian methods.
Table 1. Results of performance evaluation for single pairs chosen from each image sequence (C, U and P indicate calibrated, uncalibrated and photogrammetric approaches, respectively).

| Dataset | Method | $E_r$ [pixel] | $E_o$ ($\Delta E_o$) [°] | $E_a$ ($\Delta E_a$) |
|---|---|---|---|---|
| TEST01 (320 × 240; narrow baseline; small tilt angle; small tie-point errors; 858 tie-points used) | (C) Bouguet | 0.52 | 90.01 (0.00) | 1.00 (0.00) |
| | (U) Hartley | 0.46 | 91.77 (1.76) | 1.02 (0.02) |
| | (U) Fusiello | 0.48 | 89.91 (−0.10) | 1.00 (0.00) |
| | (P) Kim | 0.35 | 89.99 (−0.02) | 1.00 (0.00) |
| | (P) Bayesian | 0.37 | 90.01 (0.00) | 1.00 (0.00) |
| TEST02 (320 × 240; narrow baseline; medium tilt angle; large tie-point errors; 480 tie-points used) | (C) Bouguet | 0.64 | 90.07 (0.00) | 1.00 (0.00) |
| | (U) Hartley | 0.63 | 90.73 (0.66) | 1.01 (0.01) |
| | (U) Fusiello | 0.59 | 90.52 (0.45) | 1.01 (0.01) |
| | (P) Kim | 0.61 | 90.09 (0.02) | 1.00 (0.00) |
| | (P) Bayesian | 0.43 | 90.11 (0.04) | 1.00 (0.00) |
| TEST03 (640 × 480; narrow baseline; medium tilt angle; large tie-point errors; 500 tie-points used) | (C) Bouguet | 0.66 | 90.07 (0.00) | 1.00 (0.00) |
| | (U) Hartley | 3.70 | 89.48 (−0.59) | 0.99 (−0.01) |
| | (U) Fusiello | 3.26 | 89.98 (−0.09) | 1.00 (0.00) |
| | (P) Kim | 0.56 | 90.07 (0.00) | 1.00 (0.00) |
| | (P) Bayesian | 0.61 | 90.06 (−0.01) | 1.00 (0.00) |
| TEST04 (640 × 480; wide baseline; large tilt angle; large tie-point errors; 277 tie-points used) | (C) Bouguet | 0.73 | 90.38 (0.00) | 1.01 (0.00) |
| | (U) Hartley | 1.51 | 90.22 (−0.16) | 1.00 (−0.01) |
| | (U) Fusiello | 1.26 | 90.44 (0.06) | 1.01 (0.00) |
| | (P) Kim | 1.09 | 90.33 (−0.05) | 1.01 (0.00) |
| | (P) Bayesian | 1.13 | 90.35 (−0.03) | 1.01 (0.00) |
Table 2. Results of performance evaluation for all stereo pairs of the image sequences.

| Dataset | Method | $E_r$ [pixel] Mean | $E_r$ Stdev | $E_o$ ($\Delta E_o$) [°] Mean | $E_o$ Stdev | $E_a$ ($\Delta E_a$) Mean | $E_a$ Stdev | $E_v$ [pixel] |
|---|---|---|---|---|---|---|---|---|
| TEST01 (98 tie-points/frame) | (U) Hartley | 0.50 | 0.14 | 91.98 (1.97) | 0.74 | 1.02 (0.02) | 0.01 | 3.69 |
| | (U) Fusiello | 0.50 | 0.06 | 89.90 (−0.11) | 0.09 | 1.00 (0.00) | 0.00 | 4.13 |
| | (P) Kim | 0.38 | 0.05 | 89.98 (−0.03) | 0.02 | 1.00 (0.00) | 0.00 | 4.44 |
| | (P) Bayesian | 0.38 | 0.02 | 90.01 (0.00) | 0.00 | 1.00 (0.00) | 0.00 | 0.57 |
| TEST02 (57 tie-points/frame) | (U) Hartley | 0.71 | 0.86 | 89.71 (−0.36) | 2.07 | 1.00 (0.00) | 0.06 | 7.49 |
| | (U) Fusiello | 0.66 | 0.61 | 90.43 (0.36) | 0.74 | 1.00 (0.00) | 0.01 | 10.59 |
| | (P) Kim | 0.60 | 0.55 | 90.23 (0.16) | 0.10 | 1.00 (0.00) | 0.00 | 14.05 |
| | (P) Bayesian | 0.43 | 0.29 | 90.09 (0.02) | 0.02 | 1.00 (0.00) | 0.00 | 0.50 |
| TEST03 (58 tie-points/frame) | (U) Hartley | 1.58 | 1.36 | 91.25 (1.18) | 2.18 | 1.01 (0.01) | 0.02 | 5.45 |
| | (U) Fusiello | 1.62 | 1.50 | 90.36 (0.29) | 0.70 | 1.00 (0.00) | 0.00 | 17.72 |
| | (P) Kim | 1.10 | 1.24 | 90.10 (0.03) | 0.03 | 1.00 (0.00) | 0.00 | 5.24 |
| | (P) Bayesian | 0.77 | 0.55 | 90.07 (0.00) | 0.01 | 1.00 (0.00) | 0.00 | 1.68 |
| TEST04 (29 tie-points/frame) | (U) Hartley | 1.92 | 2.24 | 90.01 (−0.37) | 3.44 | 1.00 (−0.01) | 0.02 | 4.65 |
| | (U) Fusiello | 2.09 | 2.13 | 90.54 (0.16) | 0.29 | 1.01 (0.00) | 0.00 | 5.84 |
| | (P) Kim | 2.66 | 3.18 | 90.37 (−0.01) | 0.09 | 1.01 (0.00) | 0.00 | 6.02 |
| | (P) Bayesian | 1.63 | 0.88 | 90.36 (−0.02) | 0.07 | 1.01 (0.00) | 0.00 | 1.79 |
