Automatic Generation of Seamless Mosaics Using Invariant Features

The acquisition of satellite images over a wide area is often carried out across seasons because of satellite orbits and atmospheric conditions (e.g., cloud cover, dust, etc.). This results in spectral mismatch between adjacent scenes as the sun angle and the atmospheric conditions will be different for different acquisitions. In this work, we developed an approach to generate seamless mosaics using Scale-Invariant Features Transformation (SIFT). In this process, we make use of the overlapping areas between two adjacent scenes and then map spectral values of one imagery scene to another based on the filtered points detected by SIFT features to create a seamless mosaic. We make use of the Random Sample Consensus (RANSAC) method successively to filter out obtained SIFT points across adjacent tiles and to remove spectral outliers across each band of an image. Several high resolution satellite images acquired with WorldView-2 and Dubaisat-2 satellites, and medium resolution Sentinel-2 satellite imagery are used for experimentation. The experimental results show that the proposed approach can generate good seamless mosaics. Furthermore, Sentinel-2’s level 2A (L2A) product surface reflectance data is used to adjust the spectral values for color consistency.


Introduction
Satellite imagery is used in several earth observation applications such as land monitoring, classification, object detection, damage assessment, etc. Such applications often span across a wider area and at times country-scale or even larger. This area of interest cannot always be captured in one image due to the limitation in the field of view of satellite sensors and the satellite orbits. Due to the limitations of the imaging swath or acquisition mechanism, it is common that the region of interest (ROI) cannot be contained in only one remote sensing image scene. Therefore, it is necessary to generate a large-scale composite image with a sequence of overlapped satellite scenes or tiles, which is usually referred to by a technique called image mosaicking [1]. Image mosaicking consists of five aspects which include image registration, extraction of overlapping areas, radiometric normalization, seam-line detection, and finally, image blending [2].
When the area of interest spans a wider area, the mosaicking process often has color balancing issues due to different acquisition conditions of underlying scenes. Generally, there are two desirable properties required in generating visually pleasant color-balanced mosaic [3]. First, the output mosaic should be as similar as possible to the input images in terms of intensity and geometry. Furthermore, secondly, the stitching line should not appear in the mosaic output. Hence, to create a good seamless mosaic, one would need constituent satellite tiles that are geo-rectified and acquired within a few days or at most within the same season.
With an assumption that the tiles with partially overlapped regions have already been rectified by using image registration methods [4][5][6], image mosaicking methods can be reviewed and classified into two categories: blending based method and optimal seam based method [7,8]. The blending-based method that is commonly known as alpha blending or feathering approach [9][10][11][12][13][14], acts by obtaining smooth transition in the overlapped region. The mosaic output is generated considering a weighted combination of two overlapped images, where the weight coefficient is determined according to the distance from the stitching borders or the centers. Some advanced weighted blending techniques are constructed by using variational approach [15] or Poisson-based gradient-domain compositing approach [16]. However, the blending-based mosaic methods have the drawback that ghosting and blurring artifacts could be introduced if overlapped images are misaligned [3,17,18]. On the other hand, the optimal seam-based methods look to fit a curve in the overlapping region, so that the minimum difference is obtained between the two registered images [18]. With the overlapping region being divided into two parts by the searched curve, each part can then be filled with the pixels in the corresponding image. In the literature, many optimal seam searching methods are used that make use of Dijkstra's algorithm [19][20][21], dynamic programming [18,[22][23][24], and graph-cuts [25,26]. However, the optimal seam-based methods are found to have problems in generating visually pleasant mosaic output with apparent intensity differences [27]. Hence, histogram matching-based pre-processing has been extensively used [28][29][30] to avoid problems related to the visual inconsistency caused by unbalanced intensity. The histogram matching technique uses a statistics-based intensity balancing method and is found to be effective in adjusting the global intensity distribution of the target image to fit the reference image through a non-linear transformation.
Considering the pros and cons of the blending-based methods and the optimal seambased methods, both types of methods were also incorporated into one framework to improve the performance of a mosaic [31]. Usually, a low-pass filter is applied to decompose each overlapped image into two components of high-frequency and low frequency. Furthermore, mosaic schemes are designed for stitching each component. The weighted approach of blending rule-based mosaic scheme is used to achieve smooth intensity transition for low-frequency components consisting of uneven illumination information. With very little information available in the low-frequency components, the weighted blending rule can help to avoid ghosting or blurring effect [9]. However, for high-frequency components that consist of detailed information (for example edges, structures, and texture), an optimal seam searching method is modified to determine the seam line and perform the stitching process within the overlapping area [22][23][24]. Thus, modified optimal seambased mosaic methods work well in preserving structure and texture consistency in the high-frequency components with proper intensity information. Then the final mosaic is obtained by jointly considering the linear composition of the high-frequency and the low-frequency components.
In this work, we make use of the idea of an optical seam-based approach to find the appropriate points in the curve, and then, a blending process is used to create a mosaic. We use a two-step filtration approach: first, using Random Sample Consensus (RANSAC) [32][33][34] method, collinear points are obtained using Scale-Invariant Feature Transformation (SIFT). In the second step, we again use RANSAC as a robust approach to remove spectral outliers across each band to determine the linear function for mapping adjoining tile onto the base tile [34]. This method helps to properly deal with color balancing issues by mapping spectrally-consistent control points between the images. Furthermore, we also correct the spectral values of the resulting mosaic to map to standard surface reflectance values. This allows us to relate spectral values of unknown objects from satellite mosaics to the available standard spectral library for more accurate change detection, classification, and object detection [35]. The remapping of the values can be performed either by collecting the stable ground spectra or by using validated standard reflectance products. In our approach, the standard reflectance product of Sentinel's L2A is considered for re-mapping. Since the sensors of WV-2 and Sentinel-2 do not perfectly match, slight inconsistencies are expected for actual reflectance values, but the color consistency is guaranteed across scenes. The rest of the manuscript is organized as follows. 'Proposed Methodology' is provided in Section 2, which is followed by 'Results and Discussion' in Section 3, while Section 4 is 'Conclusions and Future Work'.

Proposed Methodology
In our approach, we use a two-step approach for extracting the feature points to be used for blending purposes. For this, we first find scale-invariant features in the overlapped region of the adjacent tiles. The Random Sample Consensus (RANSAC) approach is used to find the collinear points in the scale-invariant features across two adjacent tiles based on distance [36]. The RANSAC approach is used again as a robust approach for further removing the spectral outliers across each band. A linear function is determined to map the spectrally-consistent pixels for each band of a tile to map it onto the base image tile to create the homogeneous mosaic. The overview of the proposed process can be found in Figure 1, and its details are explained in Algorithm 1.

Algorithm 1 Proposed Workflow
for each tile in two adjacent image tiles do Consider one tile as base and other tile as adjoining tile Ensure the two tiles are geo-rectified Use blending approach to determine the homogeneity of consisted points in the blended image in each band of the image if The points are homogeneous then Generate a mosaic using a blending process Repeat the process for the remaining adjacent image tiles else Find scale invariant (SIFT) features within the overlapped region of two tiles Apply a RANSAC approach to retrieve set of collinear points among the SIFT features for each band in two adjacent image tiles do Use RANSAC approach as robust approach and develop linear models for every band to map adjoining image onto the base image by removing spectral outliers across each band Apply the fitting function from each band of the adjoining image onto base tile to obtain homogeneously distributed data points across each band end for Generate a mosaic using a blending process Apply the proposed approach Repeat the process for the remaining adjacent image tiles end if end for

Extraction of Robust Key Match Points in Overlapping Regions
We employ Scale-Invariant Feature Transform (SIFT) on the base image and adjoining image to detect the key match points across the adjacent tiles. To select the robust key match points, we employ the RANSAC algorithm with the homography model [37], to avoid misleading points. The RANSAC method computes a homography and provides a prediction of outliers. Using SIFT, we generate keypoints and the feature vector for each keypoint. Furthermore, if k is a keypoint in image x and k is a keypoint in image x , the feature vectors provide a way to identify whether k, k are good matches or not. Then, for each k, we look for a good match: a k from x that is significantly closer to k than any other keypoint in x . The set formed with a good matches forms a potential candidate set for image registration.
Algorithm 2 shows steps of the RANSAC algorithm. We initially pick four noncollinear feature pairs randomly to compute a homography from these four pairs, and then test the goodness of this homography, by checking whether the number of good matches is consistent with this homography or not. If k, k are a good match, we assume that in most cases the homography will map k to something near k . The one with good matches that are consistent with the homography would be treated as inliers, and those that are inconsistent are considered as outliers (for this homography) for the chosen threshold. By using a trial and error method, a good choice for threshold value was found to be 65 for dealing with the case of our experiment of finding collinear points across adjacent satellite scenes. Eventually we repeat this process 100 times and pick a set of four good matches each time. In each iteration, a homography is derived and the number of associated outliers is counted. At the end, we obtain the homography with the smallest number of inliers that constitute our feature points for image mapping. This step provides a set of refined points from a pool of SIFT feature points obtained from the base and adjoining images. With the first application of RANSAC, it allows filtering SIFT feature points across the overlapping region to find eligible points based on distance. Once the SIFT points are filtered out using RANSAC algorithm, the obtained points are further refined based on spectral values in individual bands using RANSAC as a robust approach to identify spectral outliers across each band. A linear fitting function is calculated for each band, to map the adjoining image tile onto the base image tile. Furthermore, finally, a blending method is used to create the final homogeneous mosaic.

Blending Method
Overlapping regions of two images may have different brightness or intensities. A simple way to perform blending is to use a weighted sum of the image intensities along each band using weight functions [38].
We can assign a weight function to each image w(x, y) = w(x)w(y) where w(x) varies linearly from 1 at the center of the image to 0 at the edge. If there exists a small registration error, this process may lead to blurring of high-frequency detail. Hence, a multi-band blending strategy developed by Burt and Adelson [39] is applied. The rationale behind using multi-band blending is to blend low frequencies over a large spatial range and high frequencies over a short-range.

Mapping of Mosaic to Standard Surface Reflectance
A color balanced mosaic obtained after a successful color blending approach leads to a change in the spectral values in the considered tiles/scenes. To guarantee spectral consistency, an approach to map mosaic values to standard surface reflectance is required. The remapping of the values can be performed either by collecting the ground reflectance values for stable surfaces or by using known standard reflectance products such as Sentinel-2 L2A data. We apply the procedure similar to the approach explained in Sections 2.1 and 2.2 for remapping. A linear robust regression method is applied in each band of the obtained mosaic to map the features to the surface reflectance.

Evaluation of Proposed Approach
Correlation Coefficient: The correlation coefficient helps to reflect the similarity between the base image and the final mosaic. The higher the correlation value, the higher will be the fidelity of the image spectrum and the better is the fusion quality. It can be calculated as shown in Equation (1).
Here, f 1 (i, j) is the image pixel value of the base image, f 2 (i, j) is the mapped color pixel value after mosaicking. A 1 and A 2 represent the pixel values of the two images to be mosaicked, respectively. The larger the value of C.C. in Equation (1), the greater will be the correlation between the base image tile and the final mosaic.

Satellite Images
Three different satellite datasets are used to demonstrate the effectiveness of our proposed approach. The datasets were acquired over the region of UAE during 2015/2016 using WorldView-2, Sentinel-2, and DubaiSat-2 satellites.
The WorldView-2 dataset consists of high resolution satellite imagery with eight multi-spectral bands at a spatial resolution of 2 m. The panchromatic band having 0.5 m spatial resolution is used to improve the spatial resolution of the images. Only 4 bands (R-5, G-3, B-2, and IR-8) of WorldView-2 data are considered in this work for the images acquired over the Northern Emirates of UAE in summer (June/July) months of 2014/2015. Sentinel-2 provides medium spatial resolution satellite imagery which is freely available. In this work, we use imagery acquired over UAE during the summer months of 2016. Only four bands available at 10 m spatial resolution are used. The third dataset consists of high resolution satellite imagery having 4 bands at 4m spatial resolution acquired over Dubai emirate during summer months of 2016 with DubaiSat-2 satellite. The image tiles in all the three datasets have significant color variation making it difficult to produce high quality mosaics using traditional methods.

Experiments, Observation and Analysis
The experimental setup is explained in Section 2. The process has been applied to all the three mentioned datasets and an illustration of the process is provided with the WorldView-2 and DubaiSat-2 dataset. Figure 2a shows the adjacent tiles of the WorldView-2 dataset for the Northern Emirates region in the UAE. The acquisition of tiles is mostly during the summer time in 2014, with some of the tiles acquired in 2015 that led to mosaicking issues. These issues can be clearly observed in Figure 2b, where significant differences in spectral values across some tiles led to color balancing issues during a simple blending procedure. Figure 3a shows the overlapping region in the sample adjacent tiles. The SIFT features are extracted followed by a RANSAC procedure that would allow obtaining selected points by removing distance outliers from the adjacent tiles as shown in Figure 3b. This process is followed by another application of the RANSAC method in each band to remove spectral outliers, after which a linear function for each band is developed to map the adjoining tile onto the base tile. An example of such mapping for one band is demonstrated in Figure 4. Then a blending process is followed to obtain the mosaic as seen in Figure 5a. Finally, the obtained mosaic is mapped to standard surface reflectance using a robust linear regression approach to obtain an improved surface reflectance product as seen in Figure 5b. Thus, the obtained mapped surface reflectance product is observed to have a reflectance value within the 3% limit of the considered standard surface reflectance values of stable objects such as sand, roads, alluvial plains, and roofs.     Figure 6a shows the DubaiSat-2 adjacent tiles where the considered leftmost base tile seem to consist of spectral values that are very different from standard surface reflectance. As a result, the obtained color-balanced mosaic is also observed to have spectral values in line with this considered base tile as shown in Figure 6b. Afterward, the mosaic is remapped to surface reflectance (shown in Figure 7) which is observed to have a reflectance value within the 2% limit of the considered standard surface reflectance values of above mentioned stable objects.   Table 1 shows how a base tile relates to a color-balanced mosaic tile using a proposed approach, and we can clearly see that the correlation coefficient concerning all four datasets is very high. A correlation coefficient is calculated based on the pixel value of the base tile (before mosaicking) and the same corresponding pixel value of the final mosaic. There are no significant differences one can observe visually between mosaics of Sentinel-2's level 2A and 1C product data as shown in Figures 8 and 9 but statistically, the correlation coefficient seems to decay more in the case of Sentinel-2's level 1C data. Image mosaicking has been traditionally used to develop basemaps primarily for visualization. However, such data may not be ideal for applications like change detection, classification, object detection, etc. In addition to this, since the sun angle tends to vary significantly from one month to another, the acquisition of images must be within the same period (within a few days to a maximum time of a season), otherwise, the spectral values will differ leading to significant color-balancing issues in the image mosaic. As an example, significant color balancing issue can be clearly noticed while trying to create a mosaic from Copernicus's Sentinel-2 images that have a revisit time of 10 days and whose data products are freely available at Level 1C and Level 2A processing levels. The mosaicking process has relatively fewer issues while using Level 2A data which is a reflectance product at surface level (also known as Top of Canopy (TOC)) in comparison to using level 1C which is a reflectance product at top of the atmosphere (TOA) due to the extra processing to map to surface characteristics. However, irrespective of the data product used, the proposed approach will map all the data products whether it is a radiance product or, reflectance product at TOA or TOC to obtain the acceptable surface reflectance mosaic.   The computational cost of the proposed method depends on the size and resolution of the dataset used. A standard multi-spectral scene or tile of WorldView-2 satellite consists of 4096 rows and 4096 columns contributing towards the file size of about 256 MB for 8 bands that covers about 8 km in length as well as in breadth. In our experimentation, the execution time for mosaicking two scenes of the WorldView-2 high-resolution dataset is about 2 min. It usually takes less than 30 s for extracting the filtered feature points within the overlapping region and up to 1 min for mapping the adjacent scene based on the baseline scene and performing a blending procedure. Similarly, depending upon the standard tile size of the WorldView-2 data, remapping the reflectance values back to the physical domain in a mosaic of two tiles takes less than 1 min of execution time.

Comparison with other Methods
The proposed approach has been developed to resolve the color balancing issues encountered with the parametric mosaicking methods available with commercial software such as ArcGIS and ENVI. The advantage of using our approach over other approaches and other available methods in proprietary software is that we simulate the reflectance values to achieve consistency in mosaicking during the process of color balancing or blending operations. This is especially proven to be extremely useful when performing the classification of habitats and sub-habitats consisting of more than 30 classes where a slight change in spectral values could alter and induce errors in classification outputs. Otherwise, automatic object mapping or classification in mosaic images may not provide accurate results. In our proposed approach, the RANSAC method was used successively to filter out the obtained SIFT points across adjacent tiles and to remove spectral outliers across each band of an image so that proper mapping of the spectral reflectance values can be performed. In addition to this, our approach has demonstrated a good seamless mosaic output in various three demonstrated datasets. Figures 10 and 11 show the output of various histogram and overlapping based color balancing approaches performed in the WorldView-2 images consisting of the parts of the Northern Emirates, UAE. In Figure 10, we can clearly observe that when information across bands over adjacent tiles are not linearly represented, it causes distinct color balancing issues. Even though histogram-based methods are found to be effective in many cases, the non-uniform distribution of information across the bands leads to mosaicking issues in such approaches. In Figure 10a-c, we can observe slight greenish and pinkish tint in the obtained mosaic while in Figure 10d, we can clearly observe how our proposed approach can handle color balancing issues and provide better mosaic as compared to other overlapping methods with different mentioned parameters for the earlier mentioned case.

Shortcomings of the Proposed Approach
The proposed approach has the following shortcomings. To establish proper mapping function from adjoining tile onto the base tile, sufficient points or SIFT features across the overlapping region of these two tiles need to exist, for which there has to be significant overlapping between the two adjacent tiles to properly represent the different population (object classes) of the constituent images. In the case of some tiles, whose overlapping is not sufficient, another set of tiles that has overlapping are required so that sufficient points are available to develop mapping functions. Furthermore, the overlapping region needs to contain features from various classes of objects. The presence of only a few specific classes of objects in the overlapping region makes the mapping function biased towards those few object classes only and the proposed approach may not work well with unseen classes of objects outside the overlapped region. Therefore, it is required to have significant overlapping between the tiles with the presence of various diverse classes of objects.
In the case of mapping spectral values of obtained mosaic to standard surface reflectance, it would have been better to use stable ground spectra as sensor bands of WorldView-2 and DubaiSat-2 do not perfectly match with Sentinel-2. Hence, while mapping obtained mosaic to surface reflectance using Sentinel-2 L2A product, some inconsistencies were observed as a result of which surface reflectance values for WorldView-2 and DubaiSat-2 differ by 3% and 2% respectively.

Conclusions and Future Work
We have developed an approach to generate seamless mosaics based on SIFT features by filtering outliers based on a robust approach. The RANSAC method was used successively to filter the SIFT points across adjacent tiles and to also remove spectral outliers across each band of the images. Our approach has demonstrated a good seamless mosaic output in various datasets. In our approach, we also demonstrated remapping the color-balanced mosaic image to the standard surface reflectance values. This is necessary for mapping and relating spectral values of unknown objects with the available standard spectral library especially when the task is to perform habitat and sub-habitat classification consisting of several classes. In the future, we would like to use stable ground surface reflectance measurements collected over a systematically sampled region of a mosaic to map its color-balanced mosaic to standard surface reflectance.