Slope-Restricted Multi-Scale Feature Matching for Geostationary Satellite Remote Sensing Images

For geostationary meteorological satellite (GSMS) remote sensing image registration, high computational cost and matching error are the two main challenging problems. To address these issues, this paper proposes a novel algorithm named slope-restricted multi-scale feature matching. In multi-scale feature matching, images are subsampled to different scales. From a small scale to a large scale, the offsets between the matched pairs are used to narrow the searching area of feature matching for the next larger scale. Thus, the feature matching is accomplished from coarse to fine, which will make the matching process more accurate and reduce errors. To enhance the matching performance, the outliers in the matched pairs are rectified by using slope-restricted rectification, which is based on local geometric similarity. Compared with other algorithms, the experimental results show that our proposed method is more accurate and efficient.


Introduction
Image registration is an inherent part of remote sensing image processing, since it has been widely applied in image fusion [1,2], image mosaic [3], change detection [4,5] and 3D reconstruction [6,7].In recent years, although geostationary meteorological satellites with higher spatial resolution and higher accuracy have been invented, study on registration for lower accuracy geostationary meteorological satellite (GSMS) images is still needed, especially for historical data re-processing (e.g., climate change analysis).The aim of image registration is to match two or more images of the same scene with different times, different sensors or different viewpoints.A number of methods have been proposed for remote sensing image registration.These methods can be coarsely classified into intensity-based and feature-based methods.Compared with intensity-based algorithms, feature-based algorithms have a good ability to handle image distortions and illumination changes, and reduce the computational cost.In this paper, we focus on the research of feature-based matching methods.
The scale-invariant feature transform (SIFT) algorithm [8], which was first proposed by Lowe, is one of the most popular feature descriptor algorithms for remote sensing image registration.With the attributes of rotation, scale invariance, and illumination change consistency, this method can match two images successfully even when there are significant differences in multi-sources, illumination and rotation.The SIFT algorithm is distinctive and robust, but it will be ineffective if there are many repeated or similar structures in the images.Many researchers extend the SIFT algorithm to improve the matching accuracy [9,10].Sedaghat et al. [11] proposed an improved SIFT algorithm called uniform robust scale invariant feature transform (UR-SIFT) to extract high-quality SIFT features in the uniform distribution of both scales and spaces.It can find the correspondence between image pairs even with scale differences of five times.Qin et al. [12] introduced an algorithm based on SIFT and CONTOURLET transform for remote sensing image registration.Fan et al. [13] used the spatial relationship of improved SIFT to refine the matched pairs.Sedaghat et al. [14] presented a novel descriptor based on UR-SIFT to search feature correspondences and eliminate the mismatches.Since there are few textures in GSMS images, few features can be extracted by SIFT.Some researchers focus on shape context [15] instead of SIFT for remote sensing image registration.Huang et al. [16] introduced shape context for synthetic aperture radar (SAR) and optical image registration.It captured the points distribution of a shape silhouette to develop the representations.Shi et al. [17] improved shape context with a spatial relationship to distinguish repetitive patterns.Gu et al. [18] employed polynomial fitting-based shape matching for multi-sensor remote sensing images.To get more matched pairs, Jiang et al. [19] adopted the shape context to construct the distribution of the initial matched pairs.Unfortunately, the shape context is incapable of GSMS image matching since there are too many monotonous patterns in it.Wavelet-based methods [20] transform images to a new domain and extract more essential features for image registration.However, features extracted from the high-frequency sub-bands are sensitive to translation effects.To deal with this problem, Le Moigne et al. [21] integrated steerable filters in the wavelet-based image registration.Besides, wavelets are known to be isotropic.Murphy et al. [22,23] made anisotropic generalization with shearlets and produced sparse, concentrated and directional features.These approaches obtained more robust registration than classical wave-based methods.In many situations, detecting features at the finest stable scale may not be appropriate [24].One solution to this problem is to extract features at a variety of scales and then integrate features to enhance the discrimination.Brodu et al. [25] designed multi-scale dimensionality features to describe the local geometry of points.Huang et al. [26] proposed a new ship detection approach based on multi-scale heterogeneity features.
To improve matching accuracy, there are many approaches for removing outliers.Random sample consensus (RANSAC) [27] is a classical algorithm.It selects a sample randomly from the consensus set in each iteration and finds the largest consensus set to calculate the final model parameters.Inspired by this method, Wu et al. [28] proposed an improved RANSAC algorithm named fast sample consensus (FSC) to find the correct matches from correspondence set.However, if there are too many outliers in the correspondences, the RANSAC-based methods may be time-consuming and unstable.Some researchers utilize the local geometric information to remove outliers.Aguilar et al. [29] introduced a method called Graph Transformation Matching (GTM).In this method, a K-nearest neighbor (KNN) graph was constructed to describe the relationship between the current point and the neighbors.The point will be removed if the vertex structures are dissimilar between two images.Liu et al. [30] proposed a point matching algorithm based on Restricted Spatial Order Constraints (RSOC), which makes use of both local structure and global information in each iteration to remove outliers.However, when the KNN of the outliers are all the same, RSOC failed to remove such outliers.For images with large affine transformation and low overlapping areas, Zhang et al. [31] proposed novel point matching based on the triangle-area representation (TAR) of the K-nearest neighbors (KNN-TAR).Candidate outliers found by KNN-TAR will be confirmed to be real outliers or not by the local structure of the single matching pair and the global information of the whole matching pairs.These methods removed outliers based on affine transformation invariant and got good results for sun-synchronous satellite remote sensing image registration.As for GSMS images which have large scales, calculating the transformation error leads to heavy computational cost.In our previous work, an algorithm named neighborhood coding and verification (NCV) [32] was used to remove the mismatched pairs for GSMS images.Since this algorithm construct matrices about the spatial location relationship of feature distribution, the computational cost is also expensive when neighborhood is large.
Considering the methods mentioned above are almost always applied in sun-synchronous satellite image registration, a novel method should be proposed to solve the high time cost and matching error for GSMS image registration.In this paper, a feature-matching algorithm named slope-restricted multi-scale feature matching is proposed.The matched pairs are searched from coarse to fine.At first, both the GSMS images and the landmark images are subsampled to generate multi-scale images.The offsets between matched pairs calculated in the smaller scales are fed to next larger scales for narrowing the searching areas, accelerating the matching, and improving the accuracy.Distinct from the conventional removing methods, our method does not remove the outliers directly.Based on local geometric similarity, slope-restricted rectification is employed to rectify the mismatched pairs.The rest of this paper is organized as follows: Section 2 introduces the data and methods used in this study.Section 3 presents experimental results with various image pairs and performance evaluation of the proposed methods.Section 4 draws the conclusions.

Data
In this study, the GSMS images are provided by the Fengyun-2D geostationary meteorological satellite system [33].The scale of visible GSMS image is 9160 × 10, 000 pixels.Due to the multi-source, occlusion of cloud and illumination differences, it is difficult to seek reliable matching correspondences between two GSMS images.The landmark image generated from the Global Self-consistent, Hierarchical, High-resolution Geography (GSHHG) database [34] can serve as a transition media for matching tasks.GSHHG database contains the latitude and longitude information about the shorelines.With orbit radius and sub-satellite point, landmark images can be achieved by mapping GSHHG data onto the image plane.As presented in Figure 1a,b, the shorelines contained in the landmark image correspond to the contours in the GSMS image.The spatial resolution of landmark images is the same as in the GSMS images.One visible pixel corresponds to 1.25 km and one infrared pixel corresponds to 5 km.For convenience, the scale of GSMS images and landmark images is normalized to 10, 000 × 10, 000.Following works will focus on the feature matching between the GSMS images and the landmark images.Figure 1b is part of the GSMS image.Since landmark images only contain shoreline information, the edges extracted from GSMS images are used for image matching.The common edge detection approaches apply the first or second deviation to the smooth image and then find the local maxima, like the Sobel detector and Canny detector.In contrast to these approaches, the structured forests algorithm [35] trains edge detector is based on structured learning.This approach can learn the structured information and obtain more reliable edges relative to objects.As shown in Figure 2, structured forests outperform the others in noise reduction.Figure 2c is an edge probability map produced by structured forests; the intensity at row x and column y denotes the probability of a pixel p (x, y) being edge.To generate a feature map from the edge probability map, the threshold ε 1 (ε 1 is set to 0.16 empirically) is assigned to restrict the width of edges and reduce the noise of clouds, as presented in Figure 2d.The feature point q (x, y) at row x and column y of the feature map is generated as follows: Additionally, the feature point l (x, y) at row x and column y of the landmark map is generated as follows:

Multi-Scale Feature Matching
The remote sensing images have radial distortions.Since the earth is not a standard sphere, there is no projection model which can describe the transformation among the GSMS images.Feature-to-feature matching is essential for GSMS image registration.For conventional feature matching, the matched pairs are determined by comparing the similarity between the templates and the local regions of the image which center at the matching candidates.Assume a template L i with the size of (2T + 1) × (2T + 1) is generated at point l i (x i , y i ) in the landmark map.Meanwhile, a searching window with the size of (2S + 1) × (2S + 1) is determined in the same location (x i , y i ) in the feature map.Afterwards, a local region Q j with the size of (2T + 1) × (2T + 1) can be generated in the searching window.To find the best matching point q j for l i , the geometric similarity E geo is used to measure the similarity between L i and Q j .
Similarly to Q j , on the edge probability map, a local region P j centered at p j with the size of (2T + 1) × (2T + 1) is assisted to measure the gradient similarity E gra .
In addition, the number of landmarks located within the template L i is calculated as follows: Therefore, the best matching point q j for landmark l i is obtained by local feature matching [32].After matching features between the landmark images and the GSMS images, the matched pair set The points l n and q n are respectively located at (x n , y n ) and (x n , y n ).
Due to the large size of GSMS images, the searching area of feature matching is large and the computational cost is expensive.What is worse, there are too many similar features in the GSMS images which lead to low matching accuracy.To solve these problems, the GSMS images and the landmark images are subsampled to generate multi-scale images.The feature matching will be accomplished from a small scale to a large scale.Combined with the offsets between the matched pairs in the small scale images, the searching area can be narrowed for the large scale images.In this case, the matching accuracy and efficiency will be improved.Overall, this is a coarse to fine matching process.
As seen in Figure 3, suppose an image is subsampled with sampling frequency f .m indicates the m-th scale of the multi-scale images.The original image is of the scale 1.The larger m is, the smaller scale is.In this way, the multi-scale landmark maps and the multi-scale feature maps are established.l m i and q m i indicate the feature points in the m-th scale.Besides, maps are separated into patches to accelerate the matching process.To make a balance between accuracy and time cost, the size of patch is set to 400 × 400 on experience.To ensure all patches are 400 × 400, the patches which are smaller than 400 × 400 will be zero-padded.
For each patch pair in the m-th scale, the process of feature matching is the same as mentioned above.The only difference is that the center x s,m i , y s,m i of searching window is determined by Equations ( 6) and (7).
where ∆x m+1 and ∆y m+1 are the medians calculated from the offsets between the matched pairs in In this way, the searching window can be positioned more accurately and the matched pairs can be found more precisely in the larger scale.Consequently, the complete procedures of multi-scale feature matching are outlined in Algorithm 1.

The multi-scale landmark maps
The multi-scale GSMS maps Algorithm 1 Multi-Scale Feature Matching.

Input:
The multi-scale landmark maps, the multi-scale feature maps and the multi-scale edge probability maps; Output: The matched pair set for point l m i in the m-th landmark map do 3: Determine the center x s,m i , y s,m i of the searching window with Equations ( 6) and ( 7); if m is larger than 1 then

7:
Calculate the medians ∆x m and ∆y m from the offsets between the matched pairs in C m with Equations ( 8) and ( 9); end if 11: end for

Slope-Restricted Rectification
Since attitude parameters are accurately predicted, the rotation offsets of GSMS images are less than 10 −4 • for the FY-2 geostationary meteorological satellite system.In this case, the impact of rotation on images can be estimated and the landmark images are generated under rotation condition.Therefore, the rotation transformation can be neglected when registering GSMS images and landmark images.This can be proved as follows: Suppose the rotation between GSMS images and landmark images is given by: The offsets in the horizontal and vertical directions are denoted by: where θ ≤ 10 −4 • ; the value of ∆x and ∆y are both relative to x and y.Note that x and y are independent.Setting the center of the images which is close to sub-satellite point as origin of the image coordinate, x and y are both integers within [−5000, 5000] for 10, 000 × 10, 000 GSMS images.To find the maximum of ∆x, the partial derivations are calculated.
As seen from above, the maxima of ∆x and ∆y are both smaller than one pixel, hence the rotation between GSMS images and landmark images can be neglected.
However, due to the shift of sub-satellite point, there is still global translation between GSMS images and landmark images.Moreover, the landmark images are generated based on the standard sphere model while the earth is a non-standard spheroid.There are radial distortions between GSMS images and landmark images.As shown in Figure 4, although the latitude and longitude of P GSHHG and P GSMS are the same, their radii are different.It may lead to local translation between P GSHHG and P GSMS .As discussed above, the relationship between landmark images and GSMS images can be simplified by pure local translation.Therefore, the spatial context among the edge-points within two matching regions satisfy geometric similarity.That is, when two matching regions have similar spatial context, the matching lines are mostly parallel, as shown in Figure 5.In this case, most matching lines have the same or similar slopes (computed by Equation ( 15)).If the slope of a matched pair is different from others, this matched pair will be regarded as a mismatched pair.Hence, slope-restricted rectification (SRR) is proposed to rectify the mismatches.

The landmark image
The GSMS image mapping mapping In the rectification process, the matched pairs in C are refined in sequence.The slope set β = {β n | n = 1, 2, • • • , N} contains the slopes calculated from the connections between the matched pairs.
For each matched pair in C n K , the connection between them has the slope β k .Suppose there are J different slopes obtained from all connections based on the matched pairs in , which is created by counting the times of every kind of slope occurs, and N β j is the number of β j .If elements of N β are no less than is used to control the reference number of slopes in terms of K.According to the slope information contained in N β , the matched pair set Hence, the average slope β avg n is computed by the equations as follows: where d k is the Euclidean distance between l k and q k .w k is the weight about β k .
If slope β n of the connection between current matched pair satisfies the Equation ( 18), the matched pair (l n , q n ) will be added into rectification set C rect .Otherwise, the coordinate of q n will be rectified as follows: with, where ∆x n and ∆y n are average offsets calculated from the matched pairs in C n K .x rect n , y rect n is the coordinate of q rect n .Then, the rectified pair l n , q rect n will be added into rectification set C rect .However, if all elements of N β are smaller than ε 2 × K , the matched pair (l n , q n ) will be neglected.The complete processes of SRR are given in Algorithm 2.

Algorithm 2 Slope-Restricted Rectification (SRR).
Input: The matched pair set C and the slope set β; Output: The rectification set C rect ; 1: for n from 1 to N do 2: Select the K-nearest neighbor set L n KNN and matched pair set C n K for landmark l n ; 3: Calculate N β from the slopes relative to C n K ; 4: for j from 1 to J do Find the matched pairs whose connections have slopes are equal to β j and add these pairs into C n K ; 7: end if Add the matched pair (l n , q n ) into C rect ; 13: else 14: Calculate q rect n and add the rectified pair l n , q rect n into C rect ; 15: end if 17: end for

Experimental Results and Discussion
In the experiments, the sub-satellite point of Fengyun-2D is around (86 • E; 0 • N), and only landmarks located within ±60 • of longitude and ±60 • of latitude around sub-satellite point are chosen as testing data.Some shorelines in the landmark images could not be found from the GSMS images due to the occlusion of clouds.To measure the matching performance, the ground truth is labeled manually: for each landmark in the landmark image, we accurately mark its corresponding point in the GSMS image, except the landmark covered by clouds.There are in total 25 pairs of images used in the experiments and all results are presented with average values.All experiments are implemented on a workstation with dual Intel Xeon CPU (2.1 GHz and 12 cores for each) and 128 GB RAM.In multi-scale feature matching, four parameters, M, f , T and S, should be considered carefully to achieve optimal results.In scale 1, S and T are respectively set to 20 and 30 [32].Since we find some local offsets are larger than 400 pixels between the shorelines in the landmark images and the edges in the GSMS images, S and T are respectively set to 20 and 500 f M−1 for other scales.It ensures the areas which contain matching candidates exist in the searching windows of top scale.Once f and the image size are determined, M and S can be obtained.
As for the architecture of multi-scale maps, the number of scales M and the sampling frequency f are closely associated.As shown in Figure 6a, when f is set to 10, the matching precision and recall are very low.This indicates that most useful information is lost in the subsampled images with a size of 1000 × 1000.The inaccurate offsets between the matched pairs lead to quantitative mismatched pairs.Therefore, combined with different f , M is restricted to ensure the top scale image is larger than 1000 × 1000.The smaller f corresponds to higher matching precision, recall and time cost.When f changes from 7 to 9, the values of precision and recall float.In this case , the top scale images, with size smaller than 1500 × 1500, lost partially useful information and caused unstable matching results.Figure 6b presents the time cost with different f .We can see that the local minimum occurs at f = 3.When f is set to 3 and 4, the total number of pixels for multi-scale images are close.Meanwhile, the value of S for f set to 3 is less than half of f set to 4. Besides, the total number of pixels for f set to 2 is much larger than f set to 3, although they have the similar value of S. Therefore, the computational time for f set to 3 is less than f set to 2 and 4. Making a tradeoff among precision, recall and time cost, both f and M are assigned to 3 ultimately.

Comparative Analysis
Two classical feature matching algorithms, SIFT with nearest neighbor distance ratio (NNDR) [8] and shape context with the X 2 test statistic [15] are chosen to be compared with the proposed approach.The matching results are shown in Table 1 and Figure 7. From the table, the precision and the recall of our method are better than the shape context-based method.For the presented example in Figure 7a, the matched pairs obtained by SIFT-based method are much less than the required numbers.Both the precision and recall of this method are 0. Thus, SIFT is incapable of matching the GSMS images which have relatively few textures.As seen in Figure 7b, due to the monotonous patterns in the GSMS image, shape context generates many resembling features and leads to plenty of mismatched pairs.Besides, when comparing the computation time, multi-scale feature matching is more efficient than shape context.

Parameter Setting
To rectify unreliable matched pairs, it is critical to select optimal parameters for SRR.In Figure 8, with the increasing of K and ε 2 , the precision tends to increase and the recall tends to decrease.It indicates that when ε 2 is too large, the influence of neighbors is great and the constraint of geometric similarity is strict.Besides, when K is too large, the neighborhood is large and the geometric similarity between the current pair is low.In this case, many true matched pairs are regarded as mismatched pairs and removed.As a tradeoff between the precision and the recall, K is set to 24 and ε 2 is set to 0.5.

Performance Analysis
To evaluate the performance of the proposed method, a comparison among GTM, RSOC, KNN-TAR, NCV and SRR is performed.These methods are all based on geometric similarity.The results are presented in Table 2 and Figure 9.As shown in Table 2, the precision of SRR is slightly lower than GTM but the recall is higher than the other four.Because of large radial distortions in the GSMS images, the global information which contains noise leads to low precision values for RSOC and KNN-TAR.Since GTM and KNN-TAR are too strict with the spatial location relationship among feature points, their recall values are relatively low.NCV constructs the matrices about the spatial relationship among local features, the cost for matrix computation is relatively heavy.
As far as time is concerned, the experiments show that the proposed method is more efficient than the other four.In addition, SRR has the advantage of low computation complexity.When there are N matched pairs in C, the computation complexity of GTM, RSOC and KNN-TAR is O N 3 .Since NCV and SRR construct the K-nearest neighbors of current pair, their computation complexity is O N 2 logN .
Therefore, as mentioned above, SRR has a good ability to improve the matching accuracy and reduce the time cost for GSMS image registration.

Conclusions
In this paper, a novel slope-restricted multi-scale feature matching algorithm is proposed for GSMS image registration.This paper particularly focuses on reducing the time cost and matching error.Our method uses multi-scale mechanism to narrow the searching areas and find the matched pairs from coarse to fine.In the experiments, compared with SIFT and shape context, the multi-scale feature matching has the advantages in precision, recall and time cost.To improve the matching performance, the mismatched pairs are rectified with slope-restricted rectification, which is based on local geometric similarity between the landmark images and the GSMS images.The experimental results show that slope-restricted rectification is more efficient than GTM, RSOC, KNN-TAR and NCV.Based on the matched features, our future work will focus on three-dimensional spherical stitching for multi-view GSMS images.Additionally, we will explore to extend the proposed approaches to the registration of optical and SAR images, in which more challenging affine transformations exist.

Figure 1 .
Figure 1.Some examples of landmark images and geostationary meteorological satellite (GSMS) images.(a) Part of the landmark image generated from the Global Self-consistent, Hierarchical, High-resolution Geography (GSHHG) database; (b) Part of the GSMS image.

Figure 2 .
Figure 2. The edges of the GSMS image patch detected by different edge detectors.(a) The edges detected by Sobel detector; (b) The edges detected by Canny detector; (c) The edge probability map extracted by structured forests; (d) The feature map generated from the edge probability map.
is the total number of matched pairs for the (m + 1)-th scale.The points l m+1 n and p m+1 n are respectively located at x m+1 n

Figure 3 .
Figure 3.The architecture of multi-scale maps (suppose the sampling frequency f = 2).The subsampling is conducted from bottom to top, and the matching is inverse.The red grid corresponds to the template and the blue grid corresponds to the searching window.The medians ∆x m+1 and ∆y m+1 calculated in the (m + 1)-th scale are used to determine the center of the searching window in the m-th scale.

4 :
Seek the matched pair l m i , p m j with local feature matching and add it into C m ;

Figure 4 .
Figure 4.The illustration of local translation between GSMS images and landmark images.The blue sphere is a standard projection model for landmark images and the red spheroid is an earth model for GSMS images.The projection transformation preserves lines, thus there is radial translation between P GSHHG and P GSMS .

Figure 5 .
Figure 5.When mapping the shoreline of the landmark image onto the GSMS image, the connections between the matched pairs are generated.Since two images are relative by pure local translation, the spatial context among the edge-points within two matching regions are geometrically similar.Thus, the matching lines (green lines) are mostly parallel, and the non-parallel matches (red lines) will be rectified by slope-restricted rectification (SRR).

Figure 6 .
Figure 6.Average precision, recall and running time are relative to the sampling frequency f : (a) Average precision and recall with different sampling frequencies.(b) Running time with different sampling frequencies.

Figure 7 .
Figure 7. Comparing the matching result of multi-scale feature matching with scale-invariant feature transform (SIFT) and shape context in a pair of patches.The left and right patches are corresponding to the landmark image and the GSMS image, respectively.Besides, green points denote inliers and red points denote outliers.(a) SIFT.(b) Shape context.(c) Multi-scale feature matching.

Figure 8 .
Figure 8.The effects of the number of nearest neighbors K and the slope threshold ε 2 are used to rectify the mismatched pairs.(a) The average precision of SRR with different K and ε 2 .(b) The average recall of SRR with different K and ε 2 .

Table 1 .
Corresponding values of average precision, recall and computation time obtained by shape context and multi-scale feature matching.

Table 2 .
The performance comparison using different algorithms.GTM: Graph Transformation Matching; RSOC: Restricted Spatial Order Constraints; TAR: triangle-area representation; KNN: K-nearest neighbors; NCV: neighborhood coding and verification.