Satellite-Borne Optical Remote Sensing Image Registration Based on Point Features

Since technologies in image fusion, image splicing, and target recognition have developed rapidly, as the basis of many image applications, the performance of image registration directly affects subsequent work. In this work, for rich features of satellite-borne optical imagery such as panchromatic and multispectral images, the Harris corner algorithm is combined with the scale invariant feature transform (SIFT) operator for feature point extraction. Our rough matching strategy uses the K-D (K-Dimensional) tree combined with the BBF (Best Bin First) method, and the similarity measure is the nearest neighbor/the second-nearest neighbor ratio. Finally, a triangle-area representation (TAR) algorithm is utilized to eliminate false matches in order to ensure registration accuracy. The performance of the proposed algorithm is compared with existing popular algorithms. The experimental results indicate that for visible light and multi-spectral satellite remote sensing images of different sizes and different sources, the proposed algorithm in this work is excellent in accuracy and efficiency.


Introduction
The specific objective of image registration is to find the geometric correspondence between different images that contain the same contents. It uses an accurate model to describe the internal relationship among pixels in two images and accurately match them together. Hence, a unified model is essential to find the commonalities of features in different images [1,2]. Image registration frequently appears in research works of scholars, and it possesses important value in multi-temporal and large-scale application scenarios. In the field of remote sensing, there are many types of satellite-borne optical image sensors, and their acquired remote sensing images are of different resolutions and bands. Though the accomplishment of registration, mosaicking and fusion among these images, more integral and abundant data can be generated, which can lay the foundation for subsequent work, such as change detection and scene expansion.
Recently, with the continuous development of sensor technology, researchers in the satellite remote sensing field have paid more attention to the study of high-speed and stable information transmission [3][4][5][6][7]. As sensor types become more and more diverse, remote sensing images with different spatial and spectral resolutions can be obtained using different satellite-borne platforms. At present, there are many types of observation satellites, including Landsat series and SPOT (Systeme Probatoire d'Observation de la Terre) series, as well as Chinese Gaofen (GF) series and ASTER (Advanced Spaceborne Thermal Emission and Reflection Radiometer). This work is mainly focused on the registration of satellite-borne optical remote sensing imagery, i.e., panchromatic and multispectral images. registration of satellite-borne optical remote sensing imagery, i.e., panchromatic and multispectral images.
The remainder of this article is organized as follows. The second section introduces the principle of the algorithm, including the Harris corner point algorithm, scale invariant feature transform (SIFT) algorithm, Best Bin First (BBF) algorithm and nearest/near neighbor ratio method. Then, the basic principle of the triangle-area representation (TAR) algorithm is described in detail, which is adopted to eliminate false matches in rough matching and realize fine registration. Meanwhile, the optimal affine transform parameters are obtained for the matched points. In the third section, the proposed algorithm in this work is evaluated by using the images from GF-1, GF-2 and ASTER. Finally, conclusions and discussions are given in the fourth section.

Remote Sensing Image Matching
This work uses the Harris algorithm to extract feature points from reference images and images to be registered, and then uses the SIFT operator to describe the feature points. The rough matching strategy is relied on the K-D (K-Dimensional) tree combined with the BBF algorithm, and the similarity measure is the first/second nearest neighbor ratio method. Finally, a TAR-based algorithm is used to eliminate false matches' points in order to precisely match the image. Our algorithm procedure is demonstrated in Figure 1 as follows:

Remote Sensing Image Preprocessing
There are many factors that have an impact on optical remote sensing image quality. A major disadvantage is thermal noise or interference from other factors in the imaging processes. Another disadvantage is that the encoding mode of some image systems sacrifices grayscale representation to some degree, in order to achieve high compression ratios, which makes gray differences of pixels in adjacent regions smaller [8] and reduces the gradients of gray value between objects and backgrounds. Consequently, feature extraction for registration becomes difficult and it is necessary for these images to be filtered or enhanced, so as to improve their quality before deep processing [9][10][11].
For the purpose of improving the variation range of gray scale and contrast, linear stretching is adopted in this work. Generally, a linear stretch of 0.02 can achieve acceptable visual appearances in remote sensing images. That is, the distribution of an image histogram between 2% and 98% is linearly stretched to extend dynamic range of pixels to its whole gray space, so that the whole image has more abundant gray information. Figure 2 is an enhanced example of 0.02 linear stretching; the processed image has stronger contrast, better visual appearance, and is more beneficial in terms of subsequent feature extraction than the original one.

Remote Sensing Image Preprocessing
There are many factors that have an impact on optical remote sensing image quality. A major disadvantage is thermal noise or interference from other factors in the imaging processes. Another disadvantage is that the encoding mode of some image systems sacrifices grayscale representation to some degree, in order to achieve high compression ratios, which makes gray differences of pixels in adjacent regions smaller [8] and reduces the gradients of gray value between objects and backgrounds. Consequently, feature extraction for registration becomes difficult and it is necessary for these images to be filtered or enhanced, so as to improve their quality before deep processing [9][10][11].
For the purpose of improving the variation range of gray scale and contrast, linear stretching is adopted in this work. Generally, a linear stretch of 0.02 can achieve acceptable visual appearances in remote sensing images. That is, the distribution of an image histogram between 2% and 98% is linearly stretched to extend dynamic range of pixels to its whole gray space, so that the whole image has more abundant gray information. Figure 2 is an enhanced example of 0.02 linear stretching; the processed image has stronger contrast, better visual appearance, and is more beneficial in terms of subsequent feature extraction than the original one.

Harris Feature Points Extraction
The Harris corner detection and extraction algorithm can build a rectangular window of a certain size to test every pixel in an image. The testing content is the average energy of the pixels in the window, which serves as the metric for judging whether a pixel is a corner point. Namely, when the average energy of the point in the window is greater than a preset threshold, it can be regarded as a feature point [12].

SIFT Feature Point Description
Since it does not involve the construction of multi-scale space, the time complexity of the Harris operator is rather low. It has good robustness to illumination, scale, rotation and angle transformation [13], but its detection performance for smooth images is not satisfactory [14]. In spite of its higher time complexity, the SIFT operator can capture more feature points. Taking the main characteristics of optical remote sensing images into consideration, we use the Harris algorithm to extract feature points and apply the SIFT descriptor to describing the feature points in order to satisfy both accuracy and time requirements [15].
Generally, the SIFT algorithm consists of two parts: determining feature points of an image and describing feature points [16,17]. The process of image feature points determination is similar to the perception of point information by human vision. Usually, regardless of optical image resolution, human eyes can always distinguish valid features [18]. Hence, stable feature points of images are detected at different scales, and some information obtained in the detection process is utilized to construct a multi-dimensional description symbol to arrange the Harris feature points. In the term of feature point extraction, the SIFT operator is robust to scaling, rotation and transforms of images, and it is also resistant to external factors such as illumination and noise [19].

BBF Search Strategy
The K-D tree, is essentially a binary tree structure. In this search space, data are usually divided into a binary tree structure according to spatial positions, and then search processes are also carried out according to the search rules of binary trees. The kernel idea of the K-D tree is to divide data into left and right uniform structures from top to bottom in a two-dimensional space. As a result, all data are split into a left subtree and a right subtree according to their spatial position, and then the same operation is repeated on the subtrees to part them into smaller subtrees until the data cannot be subdivided. In the process of dividing, it is vital to maintain the data balance between the

Harris Feature Points Extraction
The Harris corner detection and extraction algorithm can build a rectangular window of a certain size to test every pixel in an image. The testing content is the average energy of the pixels in the window, which serves as the metric for judging whether a pixel is a corner point. Namely, when the average energy of the point in the window is greater than a preset threshold, it can be regarded as a feature point [12].

SIFT Feature Point Description
Since it does not involve the construction of multi-scale space, the time complexity of the Harris operator is rather low. It has good robustness to illumination, scale, rotation and angle transformation [13], but its detection performance for smooth images is not satisfactory [14]. In spite of its higher time complexity, the SIFT operator can capture more feature points. Taking the main characteristics of optical remote sensing images into consideration, we use the Harris algorithm to extract feature points and apply the SIFT descriptor to describing the feature points in order to satisfy both accuracy and time requirements [15].
Generally, the SIFT algorithm consists of two parts: determining feature points of an image and describing feature points [16,17]. The process of image feature points determination is similar to the perception of point information by human vision. Usually, regardless of optical image resolution, human eyes can always distinguish valid features [18]. Hence, stable feature points of images are detected at different scales, and some information obtained in the detection process is utilized to construct a multi-dimensional description symbol to arrange the Harris feature points. In the term of feature point extraction, the SIFT operator is robust to scaling, rotation and transforms of images, and it is also resistant to external factors such as illumination and noise [19].

BBF Search Strategy
The K-D tree, is essentially a binary tree structure. In this search space, data are usually divided into a binary tree structure according to spatial positions, and then search processes are also carried out according to the search rules of binary trees. The kernel idea of the K-D tree is to divide data into left and right uniform structures from top to bottom in a two-dimensional space. As a result, all data are split into a left subtree and a right subtree according to their spatial position, and then the same operation is repeated on the subtrees to part them into smaller subtrees until the data cannot be subdivided. In the process of dividing, it is vital to maintain the data balance between the left and right subtrees as much as possible. Otherwise, search efficiency may decline. The Best Bin First (BBF) search algorithm is a search algorithm developed for K-D tree structures. It outperforms the K-D tree search algorithm in terms of processing high-dimensional features [20]. It pushes points that can be traced into a sequence, and sorts these points, according to their distances from a hyperplane. The closest point has the highest priority, and then all the points in the queue are traversed according to their priority until the queue becomes empty. In addition, BBF also imposes restrictions on its search time. That is, once its running time exceeds the pre-set time, the algorithm will directly output the current closest point as the result. Therefore, we chose the K-D tree to organize the feature points while using the BBF algorithm to search for the feature points in this work.

Similarity Measure
The first/second nearest neighbor ratio method is chosen as the similarity metric in registration for reducing the complexity of calculation [21]. First of all, the point in an image to be registered that is closest to a search point in its reference image needs to be found, and their distance is denoted as Dis1. Then, it is necessary to find the next nearest point to the search point, and their distance is denoted as Dis2. Then, when comparing Dis1/Dis2 to a given threshold, and when the ratio is less than the threshold, the point pair can be considered to be a possible real matching pair.

Fine Matching Strategy
Since satellite-borne optical remote sensing images are acquired at high altitude, the view differences among images of the same target are slight. Changes between images only involve transforms such as translation, scaling, and rotation. Therefore, affine transform models can satisfy geometric transform requirements in our cases [22,23]. Therefore, an affine transform based on Triangle-area representation (TAR) is utilized in fine matching [24].
TAR is a framework in which features at every scale, i.e., edge lengths of a triangle, are normalized locally according to their scales. Among shape attributes at different scales, local normalization features are more distinct and can more accurately describe shapes in remote sensing images. Unlike some other matching methods that use a limited number of boundary points (such as corners or knee points), TAR is an exhaustive method for all boundary points.
TAR value is calculated from a triangle region formed by points on the shape boundary. Each contour point is represented by its coordinate (x, y), and a discrete parameter sequence (x n , y n ) (n = 1, . . . , N) represents the N points obtained by resampling a shape. Then, the curvature of each point is represented using the TAR value defined below. For three sequential points, (x n−t s , y n−t s ), (x n , y n ) and (x n+t s , y n+t s ), where n ∈ [1, N], t s ∈ [1, T s ] represents arbitrary edge length of a triangle, and T s is the longest distance between any two sample points. Thus, the TAR value formed by these points can be expressed as: x n−t s y n−t s 1 x n y n 1 x n+t s y n+t s 1 (1) Given that running counterclockwise, the TAR value is positive when the local contour denoted by three sample points is convex. The TAR value is negative when the contour is concave. And the TAR value is 0 when the contour is straight. Figure 3 depicts triangular regions at different positions of a closed contour.  The above figure is a closed hammer-shaped contour. Region 1 is a convex shape, and its TAR value is greater than 0. Region 2 is a concave shape, and its TAR value is less than 0. Region 3 is a straight line. The curvature function of the TAR value of discrete points can be rewritten as: where TAR(n, n + 1) is the TAR value at ts = 1, and 2 2 n s n n d x y = +   corresponds to the first edge length of a triangle, that is, the distance between the first and second vectors of the triangle formed by points (xn, yn), (xn+1,yn+1), and (xn+2, yn+2). This equation clearly expresses the relationship between the curvature and TAR value of a shape. It is known that zero crossings of a curvature function are invariant under a general affine transformation [26], and points with non-zero curvature are also not invariant to the affine transformation [27]. Thus, considering the contour sequences xn and yn of a two-dimensional shape, if an affine transform is performed, the relationship between the original sequences and the transformed sequences is: where ˆn x and ˆn y are the transformed sequences after the affine transform, b1 and b2 are translation parameters, and a1, a2, a3 and a4 are scaling and rotation factors. The influence of translation parameters is easily eliminated by normalizing the shape boundary corresponding to a centroid. This normalization is accomplished by subtracting the average value from each boundary sequence. Substituting Equation (3) into Equation (1), we can obtain: , , s s TAR n t a a a a TAR n t = − where TÂ R is TAR value after the affine transformation. It is obvious that TÂ R is invariant for affine transformation. The above figure is a closed hammer-shaped contour. Region 1 is a convex shape, and its TAR value is greater than 0. Region 2 is a concave shape, and its TAR value is less than 0. Region 3 is a straight line. The curvature function of the TAR value of discrete points can be rewritten as: where TAR(n, n + 1) is the TAR value at t s = 1, and d s n = .
x 2 n + . y 2 n corresponds to the first edge length of a triangle, that is, the distance between the first and second vectors of the triangle formed by points (x n , y n ), (x n+1 , y n+1 ), and (x n+2 , y n+2 ). This equation clearly expresses the relationship between the curvature and TAR value of a shape. It is known that zero crossings of a curvature function are invariant under a general affine transformation [26], and points with non-zero curvature are also not invariant to the affine transformation [27]. Thus, considering the contour sequences x n and y n of a twodimensional shape, if an affine transform is performed, the relationship between the original sequences and the transformed sequences is: wherex n andŷ n are the transformed sequences after the affine transform, b 1 and b 2 are translation parameters, and a 1 , a 2 , a 3 and a 4 are scaling and rotation factors. The influence of translation parameters is easily eliminated by normalizing the shape boundary corresponding to a centroid. This normalization is accomplished by subtracting the average value from each boundary sequence. Substituting Equation (3) into Equation (1), we can obtain: TÂR(n, t s ) = (a 1 a 4 − a 2 a 3 )TAR(n, t s ) where TÂR is TAR value after the affine transformation. It is obvious that TÂR is invariant for affine transformation.

Elimilation of Fales Matches
The difficulty of feature points matching lies in noise, intensive affine transformation, etc. For instance, some feature points are shifted from their original positions and become abnormal points in an image. Nowadays, there are many point matching algorithms, most of which are based on the similarity of local features, spatial relationships, or both. In some existing algorithms, affine invariant operators are utilized to detect whether a matching point is an abnormal point, through global information [28]. For example, the RANSAC (Random Sample Consensus) algorithm establishes a model for the correspondence between point pairs to estimate transform parameters. If false matches are not more than 50%, the algorithm can eliminate them effectively [29]. In this work, we use the affine invariance of TAR to eliminate false matches. The procedure consists of three steps: constructing KNN-TAR (K-Nearest Neighbor-Triangle-Area Representation) operators [30], processing candidate outliers and removing false matches.
Most of the outliers can be found by KNN-TAR, but a few outliers have the same nearest neighbors. The removal of such outliers is very important, and it directly effects the registration performance of the proposed algorithm. The outlier removal in this study includes three parts: the KNN-TAR descriptor, the process of the candidate outliers, and the removal of the remaining outliers. That is: 1.
Constructing KNN-TAR operators. Supposing that the nearest neighbors of the outliers have more structural dissimilarity, the TAR value is used to construct an affine invariant variable, which is calculated by the K nearest neighbor (KNN) in order to find outliers.

2.
Dealing with candidate outliers. Whether the suspected outliers sifted by KNN-TAR are real false matches is determined by the local structure of the single matching pair and the global transform error.

3.
Removing false matches. Adjust the parameter setting of KNN-TAR, so as to eliminate the outliers with the same KNN.

Experimental Results
The setup of the hardware environment requires an Intel core i5-4570 processor at 3.20 Hz, with 4.00 GB RAM. The operating system is 64-bit Windows 7, and our programming software is MATLAB R2014a. Here, we compare the proposed algorithm with other existing algorithms using four pairs of satellite-borne optical remote sensing images.
The first pair of experimental images is acquired by the GF-1 Panchromatic and Multispectral sensor (PMS) at Mao County, which is a multispectral image with 3000 × 1012 pixels. The preprocessed images are shown in Figure 4.

Elimilation of Fales Matches
The difficulty of feature points matching lies in noise, intensive affine transformation, etc. For instance, some feature points are shifted from their original positions and become abnormal points in an image. Nowadays, there are many point matching algorithms, most of which are based on the similarity of local features, spatial relationships, or both. In some existing algorithms, affine invariant operators are utilized to detect whether a matching point is an abnormal point, through global information [28]. For example, the RANSAC (Random Sample Consensus) algorithm establishes a model for the correspondence between point pairs to estimate transform parameters. If false matches are not more than 50%, the algorithm can eliminate them effectively [29]. In this work, we use the affine invariance of TAR to eliminate false matches. The procedure consists of three steps: constructing KNN-TAR (K-Nearest Neighbor-Triangle-Area Representation) operators [30], processing candidate outliers and removing false matches.
Most of the outliers can be found by KNN-TAR, but a few outliers have the same nearest neighbors. The removal of such outliers is very important, and it directly effects the registration performance of the proposed algorithm. The outlier removal in this study includes three parts: the KNN-TAR descriptor, the process of the candidate outliers, and the removal of the remaining outliers. That is: 1. Constructing KNN-TAR operators. Supposing that the nearest neighbors of the outliers have more structural dissimilarity, the TAR value is used to construct an affine invariant variable, which is calculated by the K nearest neighbor (KNN) in order to find outliers.
2. Dealing with candidate outliers. Whether the suspected outliers sifted by KNN-TAR are real false matches is determined by the local structure of the single matching pair and the global transform error.
3. Removing false matches. Adjust the parameter setting of KNN-TAR, so as to eliminate the outliers with the same KNN.

Experimental Results
The setup of the hardware environment requires an Intel core i5-4570 processor at 3.20 Hz, with 4.00 GB RAM. The operating system is 64-bit Windows 7, and our programming software is MATLAB R2014a. Here, we compare the proposed algorithm with other existing algorithms using four pairs of satellite-borne optical remote sensing images.
The first pair of experimental images is acquired by the GF-1 Panchromatic and Multispectral sensor (PMS) at Mao County, which is a multispectral image with 3000 × 1012 pixels. The preprocessed images are shown in Figure 4.     By rough matching, 251 pairs of points are obtained. The straight lines in Figure 6 connect the corresponding points in the two images. It is obvious that there are some crossing lines between the two images, that is, there are obvious false matches. After the RANSAC algorithm eliminated one point pair, the obtained affine transform matrix for registration is presented in Equation (5) After 16 point pairs were eliminated by the proposed KNN-TAR algorithm, the obtained affine transform matrix for registration is given in Equation (6) Figure 7 exhibits the GF-1 image pairs after eliminating false matches using the two methods. The registration results for GF-1 are listed in Table 1. The RMSE (Root Mean Square Error) of our algorithm is 0.8619 pixels, which is better than that of the RANSAC algorithm.  By rough matching, 251 pairs of points are obtained. The straight lines in Figure 6 connect the corresponding points in the two images. It is obvious that there are some crossing lines between the two images, that is, there are obvious false matches. After the RANSAC algorithm eliminated one point pair, the obtained affine transform matrix for registration is presented in Equation (5) After 16 point pairs were eliminated by the proposed KNN-TAR algorithm, the obtained affine transform matrix for registration is given in Equation (6) Figure 7 exhibits the GF-1 image pairs after eliminating false matches using the two methods. The registration results for GF-1 are listed in Table 1. The RMSE (Root Mean Square Error) of our algorithm is 0.8619 pixels, which is better than that of the RANSAC algorithm. By rough matching, 251 pairs of points are obtained. The straight lines in Figure 6 connect the corresponding points in the two images. It is obvious that there are some crossing lines between the two images, that is, there are obvious false matches. After the RANSAC algorithm eliminated one point pair, the obtained affine transform matrix for registration is presented in Equation (5).
After 16 point pairs were eliminated by the proposed KNN-TAR algorithm, the obtained affine transform matrix for registration is given in Equation (6).  Figure 7 exhibits the GF-1 image pairs after eliminating false matches using the two methods. The registration results for GF-1 are listed in Table 1. The RMSE (Root Mean Square Error) of our algorithm is 0.8619 pixels, which is better than that of the RANSAC algorithm.   Moreover, according to the obtained transformation parameters, the bilinear interpolation method is utilized to realize image mosaicking. The final stitched GF-1 image is shown in Figure 8. The second pair of experimental images was acquired by the GF-2 PMS sensor at Mao County, which is a multispectral image with 2400 × 1800 pixels. The preprocessed and stitched GF-2 images are shown in Figure 9. The registration results for GF-2 are listed in Table 2. The RMSE of our algorithm is 5.7423 pixels, which is also better than that of RANSAC. Moreover, according to the obtained transformation parameters, the bilinear interpolation method is utilized to realize image mosaicking. The final stitched GF-1 image is shown in Figure 8.  Moreover, according to the obtained transformation parameters, the bilinear interpolation method is utilized to realize image mosaicking. The final stitched GF-1 image is shown in Figure 8. The second pair of experimental images was acquired by the GF-2 PMS sensor at Mao County, which is a multispectral image with 2400 × 1800 pixels. The preprocessed and stitched GF-2 images are shown in Figure 9. The registration results for GF-2 are listed in Table 2. The RMSE of our algorithm is 5.7423 pixels, which is also better than that of RANSAC. The second pair of experimental images was acquired by the GF-2 PMS sensor at Mao County, which is a multispectral image with 2400 × 1800 pixels. The preprocessed and stitched GF-2 images are shown in Figure 9. The registration results for GF-2 are listed in Table 2. The RMSE of our algorithm is 5.7423 pixels, which is also better than that of RANSAC.   The third pair of experimental images are acquired by the visible/near infrared part of the ASTER sensor, with spatial resolutions of 15 m. The preprocessed and stitched ASTER images are shown in Figure 10 and they are pseudo-color synthetic images. The registration results for ASTER are listed in Table 3. The RMSE of our algorithm is 0.5362 pixels, which is also better than that of the RANSAC algorithm. The third pair of experimental images are acquired by the visible/near infrared part of the ASTER sensor, with spatial resolutions of 15 m. The preprocessed and stitched ASTER images are shown in Figure 10 and they are pseudo-color synthetic images. The registration results for ASTER are listed in Table 3. The RMSE of our algorithm is 0.5362 pixels, which is also better than that of the RANSAC algorithm.  The third pair of experimental images are acquired by the visible/near infrared part of the ASTER sensor, with spatial resolutions of 15 m. The preprocessed and stitched ASTER images are shown in Figure 10 and they are pseudo-color synthetic images. The registration results for ASTER are listed in Table 3. The RMSE of our algorithm is 0.5362 pixels, which is also better than that of the RANSAC algorithm.  The fourth pair of experimental images is from GF-1 and GF-2. The preprocessed and stitched images are shown in Figure 11. The registration results for ASTER are listed in Table 4. The RMSE of our algorithm is 0.5362 pixels, which is also better than that of the RANSAC algorithm.  The fourth pair of experimental images is from GF-1 and GF-2. The preprocessed and stitched images are shown in Figure 11. The registration results for ASTER are listed in Table 4. The RMSE of our algorithm is 0.5362 pixels, which is also better than that of the RANSAC algorithm.  From the above experimental results, it is found that as the number of rough matching points increases, there are slight increases in the matching time of the KNN-TAR algorithm compared to that of the RANSAC algorithm. However, as for satellite-borne optical remote sensing images from different sources, the proposed algorithm in this work can eliminate false matches and realize accurate registration. Through comparative experiments, it can be proven that its registration accuracy is better than that of the RANSAC algorithm.
Moreover, we also compare the proposed method with three popular image registration methods, SIFT, SURF (Speed-Up Robust Features) and ORB (Oriented FAST and Rotated BRIEF). They also combined the BBF with RANSAC algorithm in feature  From the above experimental results, it is found that as the number of rough matching points increases, there are slight increases in the matching time of the KNN-TAR algorithm compared to that of the RANSAC algorithm. However, as for satellite-borne optical remote sensing images from different sources, the proposed algorithm in this work can eliminate false matches and realize accurate registration. Through comparative experiments, it can be proven that its registration accuracy is better than that of the RANSAC algorithm.
Moreover, we also compare the proposed method with three popular image registration methods, SIFT, SURF (Speed-Up Robust Features) and ORB (Oriented FAST and Rotated BRIEF). They also combined the BBF with RANSAC algorithm in feature point matching, and the SIFT algorithm still uses the Harris operator for feature extraction instead of global matching in the original version. The comparison results are also presented in Table 5. As revealed in Table 5, the proposed registration method outperforms the other methods for GF-1, ASTER and Multi-view images. Even for the GF-2 image, its registration result is also acceptable.

Conclusions
Because of small view differences of satellite remote sensing images, a registration method based on point features is designed in this study. The Harris operator, with its fast detection speed, is chosen to extract image features, and the SIFT operator is used to describe the features in order to ensure accuracy. After that, the BBF algorithm combined with the first/second-nearest neighbor method is adopted to realize rough matching of feature points. Then, a TAR method is introduced into false match elimination in order to enhance matching accuracy. The experimental results indicate that the method used in this work has better registration accuracy compared with RANSAC and some other existing registration algorithms. However, due to the combination of different optimization methods, the proposed algorithm has no significant advantages in time efficiency. With the increase in image resolution and size, registration time will increase. Therefore, other effective optimization algorithms may be utilized to accelerate parameter fitting processes to improve the speed of registration. Our proposed strategy is only effective in twodimensional space, and cannot perform well for images with many view differences. In the future, it can be modified to adapt to three-dimensional space.

Informed Consent Statement: Not applicable.
Data Availability Statement: The download website of the ASTER data images used in this work is https://e4ftl01.cr.usgs.gov/, accessed on 10 April 2021. GF-1 and GF-2 images are available at the following link: http://www.cresda.com/CN/sjfw/zxsj/index.shtml, accessed on 10 April 2021.