1. Introduction
Computer vision techniques for motion extraction are widely used in a large variety of applications, including motion tracking, motion compensation, image registration [1], remote sensing [2], biomedicine [3], satellite imagery [4] and vibration analysis [5]. Within this scope, digital image correlation (DIC) techniques are known to provide accurate results with high computational efficiency, along with good robustness against noise. Several variants of DIC exist in the literature, for example, phase-only correlation (POC) [2], upsampling cross-correlation (UCC) [6], Fourier-based correlation [1] and virtual image correlation [7].
In this paper, a well-known image correlation technique for subpixel motion extraction is analytically investigated. Subpixel accuracy is particularly important for video-based structural health monitoring (SHM) [5,8] and for aerial or satellite imagery [4]. In such applications, the region of interest may be represented by a small number of pixels in the captured images. It is then important to extract both multipixel and subpixel motion information from video images.
A large variety of correlation-based techniques for motion extraction with subpixel accuracy have been compared in [1,2]. The most widespread techniques follow a two-step process. At the pixel level, displacement is estimated by maximizing the cross-correlation between two images. To achieve subpixel accuracy, the displacement estimate is then refined in the vicinity of the cross-correlation peak. Among such refinement methods, quadratic surface fitting (QSF) provides a good trade-off between accuracy and computational burden, which makes it particularly suitable for video-based SHM, as reported in [9]. This method and its variants have also been investigated in [1,2,10,11,12,13,14]. Its good performance has been confirmed by our own experiments.
The purpose of this paper is to mathematically analyze the QSF method. By means of counterexamples, it will be shown that, contrary to a widespread intuition, the quadratic surface fitted to the cross-correlation values in the 3 × 3 pixel neighborhood of the correlation peak does not always have a maximum, even though the maximum pixel-level cross-correlation value is located at the center of this 3 × 3 pixel neighborhood. This absence of a maximum leads to a failure of the QSF method, which is supposed to determine the subpixel displacement by maximizing the quadratic surface fitted to the cross-correlation values in the vicinity of the cross-correlation peak. However, experiments reported by different authors and conducted by ourselves show that the QSF method usually produces satisfactory results. It is therefore important to understand the conditions under which this method works correctly. In this paper, the conditions ensuring the existence of a maximum of the fitted quadratic surface will be formally analyzed. These conditions will then be completed to make sure that the maximum is within the one-pixel vicinity of the pixel-level cross-correlation peak. Solutions will also be proposed to handle the failure cases of the QSF method by constrained optimization, ensuring that the estimated subpixel displacement is within one pixel. These modifications apply only when a failure occurs, hence the extra computational cost is insignificant.
This paper is an extended version of the conference paper [15]. It is organized as follows. The considered problem is formulated in Section 2. The QSF method is recalled in Section 3. Examples showing failures of the QSF method are presented in Section 4. The QSF method is then analyzed in Section 5. Handling of the failure cases is proposed in Section 6. Experimental results based on two typical types of images are reported in Section 7. Finally, conclusions are drawn in Section 8.
  2. Problem Statement
A locally rigid moving object is observed with a camera. It is assumed that the displacement of the observed object is small between successive images, with negligible rotation and negligible motion along the optical axis direction. Correlation processing will focus on a rectangular template, also known as region of interest (ROI), which includes either the whole moving object or some part of the object. The intensity of each pixel at instant t is denoted by  where the integer pair  indicates the position of the pixel in an image.
Instead of the usual row and column indexes, in this paper pixel positions are given by integer coordinates in a Cartesian system, as illustrated in Figure 1. These coordinates serve both to describe pixel positions in an image and to fit quadratic surfaces in the QSF method. In the second usage, the origin (0, 0) corresponds to zero subpixel displacement. This notation choice is more suitable for the surface fitting problem formulated in the Cartesian coordinate system, in agreement with usual mathematical notations.
Motion extraction will be carried out by determining the horizontal and vertical shifts of the selected template between two images captured at instants 
t and 
. The horizontal image shift (in number of pixels) is decomposed into an integer part 
 and a fractional (subpixel) part 
x with 
, and similarly the vertical image shift is decomposed into an integer part 
 and a fractional part 
y with 
, so that:
      
Given two images (frames) captured at time instants 
t and 
, as illustrated in 
Figure 2, template shifts are usually estimated through a two-step process [
9]. The pixel level (integer) shifts 
 are first estimated by maximizing the cross-correlation between the two image templates. At the second step, the  cross-correlation is somehow interpolated in the vicinity of the cross-correlation peak to estimate the subpixel shifts 
.
At the pixel level, let the cross-correlation be denoted by
      
      where 
 denotes the set of integer pairs 
 corresponding to the pixels belonging to the considered template. The dependences on 
 and on 
 are omitted in the notation 
 for a lighter presentation. The search for the cross-correlation peak is formulated as:
      where 
 and 
 are two positive integers specifying the search ranges respectively for horizontal and vertical shifts.
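The pixel-level search can be illustrated with a minimal Python sketch. It assumes a plain sum-of-products cross-correlation, images stored as lists of rows, and row/column indexing (so a positive vertical shift points downward) rather than the Cartesian convention of this paper. The function names `cross_correlation` and `pixel_level_peak`, the search ranges `K` and `L`, and the small test images are all hypothetical choices made for illustration.

```python
def cross_correlation(img1, img2, rows, cols, k, l):
    """Unnormalized cross-correlation (sum of products) between the template
    of img1 and the template of img2 displaced by k columns and l rows.
    This plain sum-of-products form is an assumption of the sketch."""
    return sum(img1[r][c] * img2[r + l][c + k] for r in rows for c in cols)


def pixel_level_peak(img1, img2, rows, cols, K, L):
    """Exhaustive search of the integer shifts (k, l) maximizing the
    cross-correlation, with |k| <= K and |l| <= L."""
    best_value, best_shift = None, (0, 0)
    for k in range(-K, K + 1):
        for l in range(-L, L + 1):
            value = cross_correlation(img1, img2, rows, cols, k, l)
            if best_value is None or value > best_value:
                best_value, best_shift = value, (k, l)
    return best_shift


# Hypothetical 8 x 8 images: a 2 x 2 pattern moved 2 columns right, 1 row down.
img1 = [[0] * 8 for _ in range(8)]
img2 = [[0] * 8 for _ in range(8)]
pattern = [[1, 2], [3, 4]]
for dr in range(2):
    for dc in range(2):
        img1[2 + dr][2 + dc] = pattern[dr][dc]
        img2[3 + dr][4 + dc] = pattern[dr][dc]

shift = pixel_level_peak(img1, img2, rows=range(2, 4), cols=range(2, 4), K=2, L=2)
print(shift)  # prints (2, 1)
```

The exhaustive search is affordable here because the search ranges are small. The caller is responsible for keeping the shifted template inside the image.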
In order to gain subpixel accuracy, at the second step, the cross-correlation is interpolated for non-integer shifts so that the correlation maximization (3) can be generalized to subpixel shifts.
In the QSF method, this interpolation is made by fitting, in the least squares sense, a second degree polynomial (or, geometrically, a quadratic surface) to the value of 
 and to the 8 neighboring cross-correlation values 
, namely 
 with 
. Then the maximum of the fitted polynomial yields the estimated subpixel shifts between the two templates [
9].
More formally, the integer shifts 
 being already estimated, let 
 denote the second degree polynomial fitted, in the least squares sense, to 
 for 
. Then, the subpixel shifts are estimated as
      
      and the estimated total shifts amount to:
Satisfactory experimental results of this method have been reported by different authors, for example, [1,2,9]. Our own experiments also confirm its good performance compared to other existing methods for subpixel shift estimation. The main purpose of this paper is to consolidate the theoretical basis of this method.
More specifically, the QSF method, as  recalled above, assumes implicitly that the second degree polynomial 
 fitted to the nine correlation values 
, with 
, always has a unique global maximum, corresponding to 
 located in the one-pixel vicinity of 
, so that the total shifts as expressed in (
5) do not fall too far from the pixel level optimal shifts 
. This paper will investigate the following issues.
1. Does the quadratic surface fitted in the QSF method always have a maximum in the one-pixel vicinity of the pixel-level cross-correlation peak?
2. If the answer to the first question is no, what are the conditions ensuring that the fitted quadratic surface has a maximum, and moreover, that the maximum is located in the one-pixel vicinity of the pixel-level cross-correlation peak?
3. What should the algorithm do if the fitted quadratic surface has no maximum, or if its maximum is outside the one-pixel vicinity of the pixel-level cross-correlation peak?
  3. Quadratic Surface Fitting for Subpixel Refinement
The QSF method is recalled in this section before its analysis in the next sections.
Let 
 be resulting from the pixel level maximization (
3).
The nine integer pairs 
, with  
, form a 
 grid (The grid 
 is formed by nine integer pairs organized in three rows and three columns. It is also seen as a set with the integer pairs as elements, so that notations like 
 can be used.):
Accordingly, the nine cross-correlation values 
 normalized by the maximum cross-correlation value 
 and denoted by:
      form a matrix
      
The central entry of 
      is the maximum cross-correlation value 
 normalized by itself, hence this central entry 
 is the maximum value among all the nine entries of 
.
The second degree polynomial:
      with the vector 
 collecting the scalar coefficients 
, is then fitted to the entry values of 
 for 
, by solving the least squares problem:
      where 
 is the grid defined in (
6).
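The least squares fit over the 3 × 3 grid can be sketched in a few lines of Python. The sketch assumes the polynomial is written as c_0 + c_x x + c_y y + c_xx x² + c_xy xy + c_yy y², with grid coordinates x, y in {-1, 0, 1}. These coefficient names and their ordering are choices of this illustration and may differ from the notation of (10). Because the grid is symmetric, the normal equations decouple and the solution has a simple closed form, which is what the hypothetical function `fit_quadratic_surface` uses.

```python
def fit_quadratic_surface(gamma):
    """Least squares fit of a second degree polynomial to nine values.

    gamma maps each (x, y) with x, y in {-1, 0, 1} to a correlation value.
    Returns (c_0, c_x, c_y, c_xx, c_xy, c_yy), names assumed by this sketch.
    """
    pts = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
    s0 = sum(gamma[p] for p in pts)
    sx = sum(x * gamma[(x, y)] for x, y in pts)
    sy = sum(y * gamma[(x, y)] for x, y in pts)
    sxx = sum(x * x * gamma[(x, y)] for x, y in pts)
    syy = sum(y * y * gamma[(x, y)] for x, y in pts)
    sxy = sum(x * y * gamma[(x, y)] for x, y in pts)
    # On the symmetric grid, x, y and xy are orthogonal to all other terms.
    c_x = sx / 6.0
    c_y = sy / 6.0
    c_xy = sxy / 4.0
    # x^2 - 2/3 and y^2 - 2/3 are orthogonal to 1 and to each other.
    c_xx = sxx / 2.0 - s0 / 3.0
    c_yy = syy / 2.0 - s0 / 3.0
    c_0 = s0 / 9.0 - 2.0 * (c_xx + c_yy) / 3.0
    return c_0, c_x, c_y, c_xx, c_xy, c_yy


# Sanity check on values sampled from a known quadratic (hypothetical data):
def sample(x, y):
    return 1.0 + 0.03 * x - 0.02 * y - 0.1 * x * x + 0.05 * x * y - 0.2 * y * y


gamma = {(x, y): sample(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)}
coeffs = fit_quadratic_surface(gamma)
print(coeffs)
```

When the nine values come from an exact quadratic, as in this check, the fit recovers its coefficients up to rounding. A general-purpose least squares solver would give the same result at a higher cost.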
Does this fitted second degree polynomial always have a unique global maximum?
As explained above, the central entry of the matrix Γ is the maximum value among all the nine entries of Γ. It then seems reasonable to expect that the second degree polynomial (or quadratic surface) fitted to the nine entries of Γ has a maximum close to this central entry, which corresponds to the origin (0, 0) of the coordinate system characterizing the fitted quadratic surface.
Unfortunately, the fact that the central entry is the maximum value among the nine entries of Γ does not ensure that the fitted second degree polynomial always has a global maximum, as demonstrated by the following counterexamples.
  4. Counterexamples
Three examples with either synthetic or real-world images are presented below to show that the quadratic surface fitted in the QSF method does not always have a maximum in the close vicinity of the pixel-level cross-correlation peak.
  4.1. Example 1
The first counterexample, with synthetic images, was chosen for its simplicity so that it can be easily reproduced. The Matlab code for generating the presented result is available for download [16]. To show the robustness of this counterexample, the Matlab code can optionally be run with random noise added to the generated synthetic images, though the result presented below is noise-free. Counterexamples based on real images (as presented in Examples 2 and 3) are also included in [16].
Consider two binary images of 
 pixels as shown in 
Figure 3, with the template (ROI) chosen as the red square window of 
 pixels in each image. The template in the first image contains a diagonal pattern. In the second image, this diagonal pattern is shifted by one pixel toward the right and also by one pixel toward the bottom. In these binary images, the  intensity is 1 at the darker pixels and 10 at the brighter pixels.
The normalized 
 cross-correlation matrix around the peak as defined in (
8) is:
Fitting the second degree polynomial 
 to 
 for 
 by solving the least squares problem (
11) yields the solution:
The corresponding quadratic surface exhibits a saddle point, as illustrated in Figure 4. It has no global maximum, despite the fact that the central entry of the matrix Γ is its largest entry.
This counterexample clearly invalidates the widespread intuition that the fitted polynomial in the QSF method always has a global maximum.
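The same phenomenon can be reproduced numerically with a small Python sketch. The 3 × 3 values used below are hypothetical and are not the matrix of Example 1: they only mimic a diagonal ridge, with the central entry as the largest one. The second-order coefficients of the least squares fit are computed in closed form (valid for the symmetric 3 × 3 grid), and the Hessian is tested for negative definiteness. The function names and the coefficient ordering are assumptions of this illustration.

```python
def hessian_of_fit(gamma):
    """Hessian of the second degree polynomial fitted, in the least squares
    sense, to gamma[(x, y)] for x, y in {-1, 0, 1}."""
    pts = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
    s0 = sum(gamma[p] for p in pts)
    sxx = sum(x * x * gamma[(x, y)] for x, y in pts)
    syy = sum(y * y * gamma[(x, y)] for x, y in pts)
    sxy = sum(x * y * gamma[(x, y)] for x, y in pts)
    c_xx = sxx / 2.0 - s0 / 3.0   # coefficient of x^2
    c_yy = syy / 2.0 - s0 / 3.0   # coefficient of y^2
    c_xy = sxy / 4.0              # coefficient of x*y
    return ((2.0 * c_xx, c_xy), (c_xy, 2.0 * c_yy))


def has_unique_maximum(h):
    """A 2 x 2 symmetric matrix is negative definite when its first entry
    is negative and its determinant is positive."""
    (h11, h12), (_, h22) = h
    return h11 < 0 and h11 * h22 - h12 * h12 > 0


# Hypothetical ridge along one diagonal; the center (1.0) is the largest entry.
ridge = {(x, y): 0.5 for x in (-1, 0, 1) for y in (-1, 0, 1)}
ridge[(0, 0)] = 1.0
ridge[(1, 1)] = 0.9
ridge[(-1, -1)] = 0.9

print(max(ridge, key=ridge.get))                  # prints (0, 0)
print(has_unique_maximum(hessian_of_fit(ridge)))  # prints False
```

In this hypothetical case the cross term dominates the two negative curvature terms, so the determinant of the Hessian is negative and the fitted surface is a saddle.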
  4.2. Example 2
This example is based on real images of the moon, included in Matlab as a part of its image examples. The Matlab code generating the presented results is available for download [16]. Two images of the moon are shown in Figure 5. As in Example 1, with some chosen template (ROI), cross-correlations between the two images are computed, and a quadratic surface is fitted to the normalized 3 × 3 cross-correlation matrix around the peak. In most situations, the fitted surface has a maximum close to the pixel-level cross-correlation peak, but unexpected cases do happen with some particular template choices.
For the chosen template illustrated by the red windows in Figure 5, the fitted quadratic surface is shown in Figure 6. The surface exhibits a saddle point and has no maximum. This result can be reproduced with the Matlab code downloadable from [16].
  4.3. Example 3
This example is based on the same images of the moon as in Example 2, but with another choice of the template for cross-correlation computation, as illustrated by the red windows in Figure 7. The fitted quadratic surface shown in Figure 8 has a maximum located at
        
as indicated by the cyan dot on the surface. This maximum is far outside the square area of the one-pixel vicinity of the pixel-level cross-correlation peak, delimited by the red dotted lines on the bottom plane in Figure 8. As a subpixel refinement, the maximum of the fitted surface should lie within one pixel of the origin in both the horizontal and vertical directions, which is not the case in this example. The details of this result are available from [16].
The examples presented in this section confirm that, in some rare situations, the quadratic surface fitted in the QSF method does not have a maximum, or has a maximum far away from the square area of the one-pixel vicinity of the pixel-level cross-correlation peak.
The first question raised at the end of Section 2 being thus answered, the following sections address the two remaining questions.
  5. Conditions for the Existence of a Maximum
In elementary algebra, it is well known that the second degree polynomial 
 as expressed in (
10) has a unique global maximum if its Hessian matrix,
      
      is negative definite [
17]. However, this simply stated fact does not directly help us to understand how the normalized cross-correlation values 
 (those filling up the 
 matrix in (
8)) should be, so that the fitted 
 has a global maximum. Because the polynomial coefficients 
 are determined from the values of 
 by solving the least squares problem (
11), it is straightforward to express the negative definiteness condition of 
H in terms of 
. Then, in principle, the condition for the existence of a global maximum of 
 will be formulated in terms of the normalized cross-correlation values 
. Nevertheless, this approach results in sophisticated conditions, notably an inequality involving the determinant of 
H expressed in terms of 
. For a better understanding, the  result presented below will be formulated with simple and easily interpretable inequalities about the values 
 filling up 
. For instance, one of these simple inequalities states that the central entry 
 of 
 is its largest entry. As shown by the previously presented counterexamples, this condition alone is not sufficient. It is then completed by similar simple inequalities. 
Theorem 1. If the normalized cross-correlation values filling up the matrix Γ satisfy the inequalities (15), (16) and (17) interpreted below, then the second degree polynomial fitted to the entries of Γ by solving the least squares problem (11) has a unique global maximum.
Interpretation of the conditions of Theorem 1.
- Inequalities (15): the central entry has the largest value among all the 9 entries of Γ.
- Inequalities (16): the middle entry is the largest entry of the top or the bottom row of Γ.
- Inequalities (17): the middle entry is the largest entry of the right or the left column of Γ.
Proof of Theorem 1. In order to shorten lengthy equations and inequalities, let us introduce more compact notations for the normalized cross-correlation values filling up the matrix Γ defined in (8), so that Γ is rewritten as:

Note that the letters a, b, …, i fill Γ first at the four corners, then at the middles of the side rows and columns, and finally at the central entry.
With these compact notations, the least squares solution (11) leads to
        
As already mentioned in this paper, the negative definiteness of the Hessian matrix H defined in (14) ensures that the fitted polynomial has a global maximum. Sylvester's criterion [18] is usually stated for the positive definiteness of a real symmetric (or complex Hermitian) matrix, and it is trivial to translate it to the case of negative definiteness. For the 2 × 2 matrix H, negative definiteness will be checked through:
$$H_{1,1} < 0, \qquad (20)$$
$$\det(H) > 0. \qquad (21)$$
According to the inequalities assumed in (16), e (or h, respectively) is the largest entry of the top (or bottom, respectively) row of Γ, then

and according to (15), i is the largest entry of Γ, then

These inequalities together with (19d) immediately imply (20).
It is more involved to check (21). The inequalities in (22) imply:
        
        then
        
Repeat the reasoning from (
25) to (
27) while interchanging the positions of 
 and 
:
        
        leading to
        
Combining (27) and (29) yields

This result expresses a relationship between the entries in the top row of the matrix Γ. A similar reasoning then leads to the following relationship between the entries in the bottom row of Γ:

According to the inequalities assumed in (15), i is the largest entry of Γ, then

Take the sums of the respective sides of the four inequalities in (30)–(32), then
        
This result then implies that 
 and 
, as expressed respectively in (
19d) and (
19e), satisfy:
        
        hence
        
Following the same approach, it is then similarly shown that
        
This last result can also be deduced from a certain “symmetry” between the formulae expressing 
 and 
 in (
19d) and (
19f).
It then follows from (14), (36) and (37) that

Therefore, the two inequalities (20) and (21) ensuring the negative definiteness of H are successfully checked. It is then established that the fitted second degree polynomial has a unique global maximum.    □
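The conditions of Theorem 1, as described in their interpretation, are cheap to test on a given 3 × 3 matrix. The Python sketch below is one possible reading of them: the values are stored in a dictionary keyed by grid offsets (x, y) in {-1, 0, 1}, the rows y = ±1 play the role of the top and bottom rows, and the columns x = ±1 play the role of the left and right columns. Using strict inequalities everywhere is an assumption of this sketch, as is the function name.

```python
def satisfies_theorem1_conditions(gamma):
    """Check the three families of inequalities described for Theorem 1.
    Strict inequalities are assumed throughout in this sketch."""
    center = gamma[(0, 0)]
    # (15): the central entry is the largest of the nine entries.
    cond15 = all(center > v for p, v in gamma.items() if p != (0, 0))
    # (16): in the top and bottom rows, the middle entry is the largest.
    cond16 = all(gamma[(0, y)] > gamma[(x, y)]
                 for y in (-1, 1) for x in (-1, 1))
    # (17): in the left and right columns, the middle entry is the largest.
    cond17 = all(gamma[(x, 0)] > gamma[(x, y)]
                 for x in (-1, 1) for y in (-1, 1))
    return cond15 and cond16 and cond17


# Hypothetical matrices, for illustration only.
peaked = {(x, y): 0.6 for x in (-1, 0, 1) for y in (-1, 0, 1)}
for p in [(0, 1), (0, -1), (1, 0), (-1, 0)]:
    peaked[p] = 0.8
peaked[(0, 0)] = 1.0

ridge = {(x, y): 0.5 for x in (-1, 0, 1) for y in (-1, 0, 1)}
ridge[(0, 0)] = 1.0
ridge[(1, 1)] = 0.9
ridge[(-1, -1)] = 0.9

print(satisfies_theorem1_conditions(peaked))  # prints True
print(satisfies_theorem1_conditions(ridge))   # prints False
```

The conditions are sufficient, not necessary: a matrix that fails this check may still lead to a fitted surface with a maximum, so the check is best used as a fast pre-test before examining the Hessian itself.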
When the fitted polynomial has a unique global maximum, it may happen that this maximum is far away from the origin (0, 0) corresponding to the optimized integer shifts, outside the square area of the one-pixel vicinity of the cross-correlation peak. Such situations are not desirable, since the subpixel refinement should not modify the estimated shifts by more than one pixel. The following result ensures that the maximum of the fitted polynomial stays inside the one-pixel vicinity, under easily interpretable conditions.
Theorem 2. If, in addition to the conditions of Theorem 1, the normalized cross-correlation values filling up the matrix Γ satisfy the extra condition (39), then the maximum of the fitted polynomial is located within the one-pixel vicinity of the origin, in both the horizontal and vertical directions.
Interpretation of the conditions of Theorem 2.
The conditions inherited from Theorem 1 ensure that the middle entry in each row or column of 
 is the largest entry of the row or column, without imposing any “degree of symmetry”. For example, among the top row of 
 as expressed in (
8), inequalities formulated in (
16) ensure that 
 is the largest entry, but the ratio 
 can be any positive number. Two of the inequalities in the extra condition (
39) of Theorem 2 constrain this ratio between 
 and 5, thus limiting the dissymmetry between 
 and 
.  
Proof of Theorem 2. Based on Theorem 1 (its conditions are inherited here), the fitted polynomial has a unique global maximum, which is located at

In order to prove the bound on the first coordinate of this maximum, it will be shown that the numerator of its expression is smaller than its denominator. The bound on the second coordinate is proved similarly.
With the compact notations introduced in (18) for the normalized correlation values filling up the matrix Γ, one of the inequalities contained in (39), namely
        
        is translated into
        
        which is rewritten as
        
        or in a slightly different form
        
A similar reasoning (by interchanging the positions of 
 and 
) leads to
        
Combining (48) and (50) then amounts to

This last inequality concerns the entries in the top row of Γ as in (18). Similar reasonings about the bottom row and the left and right columns of Γ then lead to

Because i is the largest entry of Γ (condition inherited from Theorem 1),
        
Summarizing (51), (52) and (55) then implies that the coefficient expressed in (19d) satisfies:

This result, together with (19e), then yields:

Similar reasonings (relying on a certain “symmetry” between the expressions in (19)) then lead to

Add together the respective sides of (19b) and (19d), then,

Every parenthesis on the right-hand side of (61) is positive, due to conditions inherited from Theorem 1. Then,
        
Based on the last 2 inequalities, 
 satisfies:
        
        hence
        
In the same manner, combining (19c) and (19f) yields:
        
        then
        
The following steps of the proof will be essentially based on:

respectively, due to (59), (60), (65) and (68).
On the one hand, (71), (72) and (69) imply (the step leading to (74) below)

and on the other hand, (69) and (70) lead to (the step leading to (78) below)
        
It is then concluded that 
 as expressed in (
41) satisfies
        
In the same way, it is also proved that:
        
The proof of Theorem 2 is thus completed.   □
   6. Handling Failures of the QSF Method
There are two possible failure cases:
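- Case 1: the fitted polynomial has no global maximum;
- Case 2: the fitted polynomial has a global maximum, but this maximum is located outside the square area of the one-pixel vicinity, with at least one coordinate larger than 1 in absolute value.
In the first case, the only reasonable proposition is to retain the optimized integer shifts as the estimated total shifts.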
In the second case, solutions on the boundary satisfying 
 and 
 are accepted. If 
 and/or 
, then the subpixel shifts are estimated by solving the constrained optimization problem:
      and the estimated total shifts amount to:
As the fitted function is a second degree polynomial, the constrained optimization problem (83) can be solved by quadratic programming algorithms. In Case 2, the unique unconstrained maximum of the fitted polynomial is outside the square area of the one-pixel vicinity, hence the constrained solution is necessarily on the boundary of this square area, on one of its sides or at one of its corners. Given the simplicity of this quadratic problem, instead of applying a general quadratic programming tool, the constrained optimization problem (83) can be solved as follows:
- Compute the values of the fitted polynomial at the four corners of the square, namely (±1, ±1);
- Find the maximums of the fitted polynomial along the sides of the square, in x on the two horizontal sides and in y on the two vertical sides:
           
- Eliminate any result(s) lying outside the square, that is, with a maximizing x or y larger than 1 in absolute value.
- Find the maximum value among the four corner values and the non-eliminated side values, if any. The solution of the constrained optimization problem (83) is then given by the location of this maximum.
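The handling of the two failure cases can be sketched in Python as follows. The sketch assumes that the fitted polynomial is given by six coefficients (c_0, c_x, c_y, c_xx, c_xy, c_yy) of 1, x, y, x², xy and y². This parameterization and the function names `poly` and `refine_subpixel` are assumptions of the illustration, not the notation of this paper. The returned pair is the subpixel shift to be added to the integer shifts.

```python
def poly(c, x, y):
    """Evaluate c_0 + c_x x + c_y y + c_xx x^2 + c_xy x y + c_yy y^2."""
    c_0, c_x, c_y, c_xx, c_xy, c_yy = c
    return c_0 + c_x * x + c_y * y + c_xx * x * x + c_xy * x * y + c_yy * y * y


def refine_subpixel(c):
    """Subpixel shift from the fitted coefficients, with failure handling."""
    c_0, c_x, c_y, c_xx, c_xy, c_yy = c
    det = 4.0 * c_xx * c_yy - c_xy * c_xy      # determinant of the Hessian
    if not (c_xx < 0 and det > 0):
        return 0.0, 0.0                        # Case 1: keep integer shifts
    # Unconstrained maximum: the point where the gradient vanishes.
    x = (c_xy * c_y - 2.0 * c_yy * c_x) / det
    y = (c_xy * c_x - 2.0 * c_xx * c_y) / det
    if abs(x) <= 1 and abs(y) <= 1:
        return x, y                            # no failure
    # Case 2: search the boundary of the square, starting with the corners.
    candidates = [(sx, sy) for sx in (-1.0, 1.0) for sy in (-1.0, 1.0)]
    for s in (-1.0, 1.0):
        xs = -(c_x + c_xy * s) / (2.0 * c_xx)  # maximum in x on the side y = s
        if abs(xs) <= 1:
            candidates.append((xs, s))
        ys = -(c_y + c_xy * s) / (2.0 * c_yy)  # maximum in y on the side x = s
        if abs(ys) <= 1:
            candidates.append((s, ys))
    return max(candidates, key=lambda q: poly(c, q[0], q[1]))


# Hypothetical coefficients whose unconstrained maximum is at (2.5, 0).
print(refine_subpixel((1.0, 0.5, 0.0, -0.1, 0.0, -0.1)))
```

For these hypothetical coefficients the constrained solution lies on the side x = 1, at y = 0. Since the Hessian is negative definite in Case 2, the polynomial restricted to each side is concave, so each side has a well-defined one-dimensional maximum before the elimination step.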
  7. Assessment Based on Two Typical Types of Images
The original QSF method and the modified method are tested with two typical types of images for failure rate evaluation. As already stated in the introduction, the original QSF method works correctly in most situations. The occurrence frequency of its failures depends on the type of processed images. Two typical examples are presented below in this section.
For the moon example already considered in Section 4, when the template window is close to a corner of the left side of the images, as shown in Figure 5, the selected region of interest has a diagonally dominant pattern, sharing a common characteristic with the synthetic example shown in Figure 3. For this reason, the QSF method is more likely to encounter difficulties with these images.
A second example comes from the publicly available DIC Challenge database [19,20] provided by the Society for Experimental Mechanics. The tested images contain uniform patterns, as shown in Figure 9.
For each trial with the original QSF method, the template window is placed at a different position in the processed images, as illustrated in Figure 5 and Figure 9. Among all the trials, the number of cases where the fitted quadratic surface exhibits a saddle point (absence of maximum) is counted. The second type of failures, with maximum outside the one-pixel vicinity, is also counted. The results are summarized in Table 1.
As expected, the moon example leads to more failures (0.21% and 4.79% of the total number of trials for the two types of failures, respectively) due to the existence of diagonally dominant patterns. For the second type of failure, the maximum of the fitted surface can be far away from the one-pixel vicinity, up to 52 pixels. On the other hand, the DIC Challenge example with uniform patterns has very few failures (8.74 ppm and 52.4 ppm).
Though failures of the original QSF method rarely happen, it is important to prevent them for reliable applications.
After the modifications of the QSF method proposed in Section 6, for each of the trials with the above two examples, the modified method successfully finds a subpixel displacement within the one-pixel vicinity of the pixel-level cross-correlation peak.
  8. Conclusions
In this paper, within the scope of digital image correlation techniques, some theoretical aspects of the QSF method have been investigated. It has been shown that, contrary to a widespread intuition, the quadratic surface fitted in the QSF method does not always have a maximum. For a better understanding of this method, mathematical conditions ensuring the expected results have then been established. Algorithm modifications have also been proposed to handle the unexpected cases. Finally, experimental results based on two typical types of images have been reported. These results contribute to consolidating both the theory and the practice of the QSF method.