# Using Information Content to Select Keypoints for UAV Image Matching


## Abstract


## 1. Introduction

#### Paper Aims and Contributions

## 2. Keypoint Selection Criteria

1. **Entropy:** This refers to the uniqueness of the keypoints and their descriptors. The entropy of a keypoint, computed in the relevant Gaussian image within a local region around the feature point, shows its distinctiveness. It can be computed using the following equation:
$$E = -\sum_{i} P_i \log_2 P_i.$$
In Equation (1), $P_i$ is the histogram-derived probability of the $i$th pixel within the local region around the detected keypoint. The scale of the extracted feature ($\sigma$) is used to select the relevant smoothed image. The local region is a surrounding circle with radius $r = 3\sigma$, as suggested by [30].

2. **Spatial saliency:** Another property related to the information content of an image is spatial saliency. This criterion detects visually distinct and prominent areas of the image, which are more critical and contain valuable information. Visual attention is the process that allows machine vision systems to detect the most essential and informative regions of an image. Among the various algorithms for computing spatial saliency, the Weighted Maximum Phase Alignment (WMAP) model [42] has offered appropriate results on various datasets [43]. This model uses the local phase (energy) information of the input data to build the saliency map by integrating the WMAP measures for each pixel over each colour channel $c$:
$$Saliency(x, y) = \sum_{c=1}^{3} WMAP(x, y).$$
To obtain the WMAP measure, the level of local phase alignment of the Fourier harmonics is weighted by the strength of the visual structure at each scale $i$ (of $s$ scales) to measure its importance:
$$WMAP(x, y) = w_{fdn} \cdot \max_{i=1}^{s} \left\{ \left\| \overrightarrow{f_{M,i}}(x, y) \right\| \cdot \cos\theta_i \right\},$$

3. **Texture coefficient:** The grey values underlying the detected features should have enough variation to generate a unique descriptor for each keypoint. To quantify this variation, a coefficient describing the local texture of the image is used: the standard deviation of the $N \times N$ neighbouring pixels surrounding each keypoint. It is computed from the grey values of the smoothed images produced during the keypoint extraction process:
$$txt\text{-}coef = \frac{\sum_{layer-1}^{layer+1} \sqrt{\frac{1}{N^2} \sum_{i=1}^{n} \sum_{j=1}^{m} \left( x_{ij} - \overline{x} \right)^2}}{3}.$$
In Equation (4), $m$ and $n$ are the coordinates of the keypoint in the image, $N$ is the number of neighbouring pixels around the target pixel, $layer$ is the relevant DoG scale-space image, $x_{ij}$ is the grey level of the pixels in the relevant scale space, and $\overline{x}$ is the mean grey level in the template.
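For illustration, the entropy criterion of Equation (1) and the texture coefficient of Equation (4) can be sketched in code. This is a minimal NumPy sketch, not the authors' implementation; the function names, histogram bin count, and border handling are illustrative assumptions.

```python
import numpy as np

def local_entropy(gaussian_img, x, y, sigma, bins=256):
    # Entropy (Eq. 1) over a circular region of radius r = 3*sigma
    # around keypoint (x, y) in the relevant Gaussian-smoothed image.
    r = int(np.ceil(3 * sigma))
    h, w = gaussian_img.shape
    y0, y1 = max(0, y - r), min(h, y + r + 1)
    x0, x1 = max(0, x - r), min(w, x + r + 1)
    ys, xs = np.ogrid[y0:y1, x0:x1]
    mask = (xs - x) ** 2 + (ys - y) ** 2 <= r ** 2   # circular window
    patch = gaussian_img[y0:y1, x0:x1][mask]
    hist, _ = np.histogram(patch, bins=bins, range=(0, 256))
    p = hist / hist.sum()                            # histogram-derived P_i
    p = p[p > 0]                                     # avoid log2(0)
    return float(-np.sum(p * np.log2(p)))

def texture_coefficient(dog_layers, x, y, N=7):
    # Texture coefficient (Eq. 4): mean standard deviation of the N x N
    # neighbourhood over the layers below, at, and above the keypoint scale.
    half = N // 2
    stds = [layer[y - half:y + half + 1, x - half:x + half + 1].std()
            for layer in dog_layers]                 # [layer-1, layer, layer+1]
    return float(np.mean(stds))
```

Keypoints whose values for these criteria fall below a threshold are later discarded (Section 3.1).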

## 3. Evaluation Methodology

#### 3.1. Evaluations Using Synthetic Data

1. **Scaling:** Spatial scaling of an image can be obtained by modifying the coordinates of the input image according to the following equation:
$$\left[\begin{array}{c} x_k \\ y_j \end{array}\right] = \left[\begin{array}{cc} s_x & 0 \\ 0 & s_y \end{array}\right] \cdot \left[\begin{array}{c} u_q \\ v_p \end{array}\right],$$

2. **Rotation:** Rotation of an input image about its origin can be accomplished by the following equation:
$$\left[\begin{array}{c} x_k \\ y_j \end{array}\right] = \left[\begin{array}{cc} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{array}\right] \cdot \left[\begin{array}{c} u_q \\ v_p \end{array}\right],$$

3. **2D projective transform:** The projective transformation shows how perceived objects change when the viewpoint of the observer changes. This transformation allows creating perspective distortion and is computed using the following equation:
$$\left[\begin{array}{c} x_k \\ y_j \\ 1 \end{array}\right] = \underset{H}{\underbrace{\left[\begin{array}{ccc} a_1 & a_2 & b_1 \\ a_3 & a_4 & b_2 \\ c_1 & c_2 & 1 \end{array}\right]}} \cdot \left[\begin{array}{c} u_q \\ v_p \\ 1 \end{array}\right],$$
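Assuming each transformation above is expressed as a 3 × 3 homography acting on homogeneous coordinates, generating the synthetic coordinate warps can be sketched as follows. This is a hedged sketch; the function names are illustrative, and the scaling and rotation matrices are embedded into the general homography form of the projective transform.

```python
import numpy as np

def transform_points(pts, H):
    # Apply a 3x3 homography H to an (n, 2) array of (u, v) coordinates.
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coords
    out = pts_h @ H.T
    return out[:, :2] / out[:, 2:3]                   # perspective divide

def scaling(sx, sy):
    # Scaling as a homography: x = sx * u, y = sy * v.
    return np.diag([sx, sy, 1.0])

def rotation(theta):
    # Rotation about the image origin by angle theta (radians).
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def projective(a, b, c):
    # General 2D projective transform H with affine part a = (a1..a4),
    # translation b = (b1, b2) and projective part c = (c1, c2).
    (a1, a2, a3, a4), (b1, b2), (c1, c2) = a, b, c
    return np.array([[a1, a2, b1], [a3, a4, b2], [c1, c2, 1.0]])
```

For example, `transform_points(pts, rotation(np.pi / 4))` rotates keypoint coordinates by 45° so that repeatability can be checked between the original and transformed images.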

- The initial keypoint locations and scales were extracted in both original and transformed images.
- The quality values for each keypoint were computed separately: the entropy, spatial saliency and texture coefficient measures were obtained using Equation (1), the WMAP method and Equation (4), respectively. Each value was considered as the competency value of the corresponding initial keypoint.
- For each criterion, the mean competency value of all initial keypoints was used as a threshold. Most researchers [30,34,35] define this threshold empirically. In this paper, however, we use the mean competency value: it is computed robustly over the corresponding image and, since the values of each criterion are all positive, their average is meaningful and serves well as a threshold for eliminating the weaker points. In this way, the selected points all have a competency value above the average of all points.
- The initial keypoints were ordered based on their competency values; only the keypoints with a competency value higher than the computed threshold were selected, and the remaining keypoints were discarded.
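The mean-competency selection described above can be sketched as follows (a minimal sketch; names are illustrative, not from the paper):

```python
import numpy as np

def select_keypoints(keypoints, competency):
    # Keep only keypoints whose competency value (entropy, saliency or
    # texture coefficient) exceeds the mean over all initial keypoints.
    competency = np.asarray(competency, dtype=float)
    threshold = competency.mean()            # image-adaptive threshold
    order = np.argsort(-competency)          # rank by descending competency
    kept = [keypoints[i] for i in order if competency[i] > threshold]
    return kept, threshold
```

Because the threshold is the mean of the competency values of the current image, no empirical tuning is required per dataset.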

#### 3.2. Evaluations Using Real Images

#### 3.3. The Quality Measures Used for Comparisons

- **RMSE of the bundle adjustment:** This criterion expresses the re-projection error of all computed 3D points.
- **Average number of rays per 3D point:** This shows the redundancy of the computed 3D object coordinates.
- **Visibility of 3D points in more than three images:** This indicates the number of triangulated points that are visible in at least three images of the block.
- **Average intersection angle per 3D point:** This criterion shows the intersection angle of the rays determining each 3D point by triangulation. A higher intersection angle of homologous rays provides more accurate 3D information.
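Two of these measures can be computed directly from the bundle-adjustment output. The sketch below assumes known camera centres and 2D re-projection residuals; the names are illustrative, not from the paper.

```python
import numpy as np

def reprojection_rmse(observed, projected):
    # RMSE of the bundle adjustment: root mean square of the 2D
    # re-projection residuals over all image observations.
    d = np.asarray(observed, float) - np.asarray(projected, float)
    return float(np.sqrt((d ** 2).sum(axis=1).mean()))

def intersection_angle(cam1, cam2, point):
    # Angle (degrees) between the two homologous rays joining the camera
    # centres cam1 and cam2 to a triangulated 3D point.
    r1 = np.asarray(point, float) - np.asarray(cam1, float)
    r2 = np.asarray(point, float) - np.asarray(cam2, float)
    cosang = r1 @ r2 / (np.linalg.norm(r1) * np.linalg.norm(r2))
    return float(np.degrees(np.arccos(np.clip(cosang, -1.0, 1.0))))
```

For a point seen in more than two images, the per-point intersection angle can be averaged over all camera pairs observing it.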

#### 3.4. Implementation

## 4. Description of Datasets

## 5. Results

#### 5.1. Matching Results Obtained Using the Synthetic Dataset

#### 5.1.1. Repeatability

#### 5.1.2. Precision

#### 5.1.3. Recall

#### 5.1.4. Global Coverage

#### 5.1.5. Matching Results in Detectors Other Than SIFT

The best results are shown in **bold font**. The mean and standard deviation under the different geometric transformations were calculated for each assessment criterion. A higher standard deviation indicates a less stable performance of the corresponding detector. Further discussion is provided in this section.

#### MSER

The best results are shown in **bold font**. Comparing the results under the different geometric transformations shows that MSER extracts the most repeatable keypoints but does not perform well in the matching process. In all experiments, the precision and stability of the original MSER detector were lower than those of the keypoint selection methods; generally speaking, the selection methods achieved better results than the original MSER. MSER-entropy, with an average precision of 73.41% and an average recall of 24.18%, outperforms the other keypoint selection methods in images with rotation changes. For scaled images, MSER-saliency achieves superior results, with average precision and recall of 70.74% and 95.5%, respectively. Under viewpoint change, MSER-entropy obtains the best results in terms of precision, while MSER-saliency is the best in terms of recall.

#### SURF

#### BRISK

#### 5.2. Results Obtained Using Real Images

#### 5.2.1. RMSE of the Bundle Adjustment

#### 5.2.2. Average Angles of Intersection

#### 5.2.3. Average Rays Per 3D Point

#### 5.2.4. Visibility of 3D Points in More Than Three Frames

#### 5.2.5. The Number of 3D Points

#### 5.2.6. Processing Time

## 6. Discussions

## 7. Development of a New Hybrid Keypoint Selection Algorithm

#### 7.1. RMSE of the Bundle Adjustment

#### 7.2. Average Angles of Intersection

#### 7.3. Average Rays Per 3D Point

#### 7.4. Visibility of 3D Points in More Than Three Frames

#### 7.5. The Number of 3D points

#### 7.6. Processing Time

## 8. Conclusions

- Each keypoint detector performs differently depending on the type of image and scene. By enhancing the capabilities of the original detectors, the keypoint selection strategies (i.e., entropy, saliency and texture coefficient) are advantageous for accurate image matching and UAV image orientation.
- Despite the reduction in the number of keypoints, the applied keypoint selection strategies largely preserve the distribution of corresponding points.
- Information-based selection methods can effectively control the number of detected keypoints. As discussed (Section 5.1 and Section 5.2.5), in our experiments the remaining keypoints range between 38% and 61% for the synthetic data and between 25% and 50% for the real data. As a result, the large number of detected keypoints that makes the matching process time-consuming and error-prone is significantly decreased. The image orientation results also confirm this.
- It should be noted that keypoint selection algorithms are not capable of finding matched pairs at lower scales over the entire image, because there is no control over the scale of the keypoints in the selection process.

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

- Colomina, I.; Molina, P. Unmanned Aerial Systems for Photogrammetry and Remote Sensing: A Review. ISPRS J. Photogramm. Remote Sens.
**2014**, 92, 79–97. [Google Scholar] [CrossRef] [Green Version] - Nex, F.; Remondino, F. UAV for 3D Mapping Applications: A Review. Appl. Geomat.
**2014**, 6, 1–15. [Google Scholar] [CrossRef] - Hassanalian, M.; Abdelkefi, A. Classifications, applications, and design challenges of drones: A review. Prog. Aerosp. Sci.
**2017**, 91, 99–131. [Google Scholar] [CrossRef] - Bay, H.; Ess, A.; Tuytelaars, T.; van Gool, L. Speeded-up Robust Features (Surf). Comput. Vis. Image Underst.
**2008**, 110, 346–359. [Google Scholar] [CrossRef] - Leutenegger, S.; Chli, M.; Siegwart, R. Brisk: Binary Robust Invariant Scalable Keypoints. In IEEE International Conference on Computer Vision (ICCV); IEEE: New York, NY, USA, 2011. [Google Scholar]
- Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis.
**2004**, 60, 91–110. [Google Scholar] [CrossRef] - Verdie, Y.; Yi, K.; Fua, P.; Lepetit, V. Tilde: A Temporally Invariant Learned Detector. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015. [Google Scholar]
- Wang, Z.; Fan, B.; Wu, F. Frif: Fast Robust Invariant Feature. In Proceedings of the BMVC, Bristol, UK, 9–13 September 2013. [Google Scholar]
- Xing, Y.; Zhang, D.; Zhao, J.; Sun, M.; Jia, W. Robust Fast Corner Detector Based on Filled Circle and Outer Ring Mask. IET Image Process.
**2016**, 10, 314–324. [Google Scholar] [CrossRef] - Krig, S. Interest Point Detector and Feature Descriptor Survey. In Computer Vision Metrics; Springer: Berlin/Heidelberg, Germany, 2016; pp. 187–246. [Google Scholar]
- Lee, H.; Jeon, S.; Yoon, I.; Paik, J. Recent Advances in Feature Detectors and Descriptors: A Survey. IEIE Trans. Smart Process. Comput.
**2016**, 5, 153–163. [Google Scholar] [CrossRef] [Green Version] - Ahmadabadian, A.H.; Robson, S.; Boehm, J.; Shortis, M.; Wenzel, K.; Fritsch, D. A comparison of dense matching algorithms for scaled surface reconstruction using stereo camera rigs. ISPRS J. Photogramm. Remote Sens.
**2013**, 78, 157–167. [Google Scholar] [CrossRef] - Heinly, J.; Dunn, E.; Frahm, J.-M. Comparative Evaluation of Binary Features. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2012; pp. 759–773. [Google Scholar]
- Miksik, O.; Mikolajczyk, K. Evaluation of Local Detectors and Descriptors for Fast Feature Matching. In Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012), Tsukuba, Japan, 11–15 November 2012. [Google Scholar]
- Mousavi, V.; Khosravi, M.; Ahmadi, M.; Noori, N.; Haghshenas, S.; Hosseininaveh, A.; Varshosaz, M. The performance evaluation of multi-image 3D reconstruction software with different sensors. Measurement
**2018**, 120, 1–10. [Google Scholar] [CrossRef] - Sedaghat, A.; Mohammadi, N. Uniform competency-based local feature extraction for remote sensing images. ISPRS J. Photogramm. Remote Sens.
**2018**, 135, 142–157. [Google Scholar] [CrossRef] - Karpenko, S.; Konovalenko, I.; Miller, A.; Miller, B.; Nikolaev, D. UAV Control on the Basis of 3D Landmark Bearing-Only Observations. Sensors
**2015**, 15, 29802–29820. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Konovalenko, I.; Kuznetsova, E.; Miller, A.; Miller, B.; Popov, A.; Shepelev, D.; Stepanyan, K. New Approaches to the Integration of Navigation Systems for Autonomous Unmanned Vehicles (UAV). Sensors
**2018**, 18, 3010. [Google Scholar] [CrossRef] [Green Version] - Lingua, A.; Marenchino, D.; Nex, F. Performance Analysis of the SIFT Operator for Automatic Feature Extraction and Matching in Photogrammetric Applications. Sensors
**2009**, 9, 3745–3766. [Google Scholar] [CrossRef] [PubMed] - Lerma, J.L.; Navarro, S.; Cabrelles, M.; Seguí, A.; Hernandez, D. Automatic orientation and 3D modelling from markerless rock art imagery. ISPRS J. Photogramm. Remote Sens.
**2013**, 76, 64–75. [Google Scholar] [CrossRef] - Safdari, M.; Moallem, P.; Satari, M. SIFT Detector Boosted by Adaptive Contrast Threshold to Improve Matching Robustness of Remote Sensing Panchromatic Images. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.
**2019**, 12, 675–684. [Google Scholar] [CrossRef] - Wu, C. Towards linear-time incremental structure from motion. In Proceedings of the 2013 International Conference on 3D Vision, 3DV 2013, Seattle, WA, USA, 29 June–1 July 2013; pp. 127–134. [Google Scholar] [CrossRef] [Green Version]
- Zhao, J.; Zhang, X.; Gao, C.; Qiu, X.; Tian, Y.; Zhu, Y.; Cao, W. Rapid Mosaicking of Unmanned Aerial Vehicle (UAV) Images for Crop Growth Monitoring Using the SIFT Algorithm. Remote Sens.
**2019**, 11, 1226. [Google Scholar] [CrossRef] [Green Version] - Hartmann, W.; Havlena, M.; Schindler, K. Predicting Matchability. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA, 2014; pp. 9–16. [Google Scholar]
- Hartmann, W.; Havlena, M.; Schindler, K. Recent developments in large-scale tie-point matching. ISPRS J. Photogramm. Remote Sens.
**2016**, 115, 47–62. [Google Scholar] [CrossRef] - Papadaki, A.I.; Hansch, R. Match or No Match: Keypoint Filtering based on Matching Probability. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Seattle, WA, USA, 14–19 June 2020; Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA, 2020; pp. 4371–4378. [Google Scholar]
- Royer, E.; Chazalon, J.; Rusinol, M.; Bouchara, F. Benchmarking Keypoint Filtering Approaches for Document Image Matching. In Proceedings of the 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), Kyoto, Japan, 9–15 November 2017; Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA, 2017; Volume 1, pp. 343–348. [Google Scholar]
- Hossein-Nejad, Z.; Agahi, H.; Mahmoodzadeh, A. Image matching based on the adaptive redundant keypoint elimination method in the SIFT algorithm. Pattern Anal. Appl.
**2020**, 1–15. [Google Scholar] [CrossRef] - Hossein-Nejad, Z.; Nasri, M. RKEM: Redundant Keypoint Elimination Method in Image Registration. IET Image Process.
**2017**, 11, 273–284. [Google Scholar] [CrossRef] - Sedaghat, A.; Mokhtarzade, M.; Ebadi, H. Uniform Robust Scale-Invariant Feature Matching for Optical Remote Sensing Images. IEEE Trans. Geosci. Remote Sens.
**2011**, 49, 4516–4527. [Google Scholar] [CrossRef] - Royer, E.; Lelore, T.; Bouchara, F. COnfusion REduction (CORE) algorithm for local descriptors, floating-point and binary cases. Comput. Vis. Image Underst.
**2017**, 158, 115–125. [Google Scholar] [CrossRef] [Green Version] - Mukherjee, P.; Srivastava, S.; Lall, B. Salient keypoint selection for object representation. In Proceedings of the 2016 Twenty Second National Conference on Communication (NCC), Guwahati, India, 4–6 March 2016; Institute of Electrical and Electronics Engineers (IEEE): New York, NY, USA, 2016; pp. 1–6. [Google Scholar]
- Apollonio, F.; Ballabeni, A.; Gaiani, M.; Remondino, F. Evaluation of feature-based methods for automated network orientation. Int. Arch. Photogramm. Remote Sens. Spat. Inform. Sci.
**2014**, 40, 45. [Google Scholar] - Buoncompagni, S.; Maio, D.; Maltoni, D.; Papi, S. Saliency-based keypoint selection for fast object detection and matching. Pattern Recognit. Lett.
**2015**, 62, 32–40. [Google Scholar] [CrossRef] - Paul, S.; Pati, U.C. Remote Sensing Optical Image Registration Using Modified Uniform Robust SIFT. IEEE Geosci. Remote Sens. Lett.
**2016**, 13, 1300–1304. [Google Scholar] [CrossRef] - Xing, L.; Dai, W. A local feature extraction method based on virtual line descriptors. Signal Image Video Process.
**2020**. [Google Scholar] [CrossRef] - Dusmanu, M.; Ignacio, R.; Tomas, P.; Marc, P.; Josef, S.; Akihiko, T.; Torsten, S. D2-Net: A Trainable Cnn for Joint Detection and Description of Local Features. arXiv
**2019**, arXiv:1905.03561. [Google Scholar] - Luo, Z.; Lei, Z.; Xuyang, B.; Hongkai, C.; Jiahui, Z.; Yao, Y.; Shiwei, L.; Tian, F.; Long, Q. Aslfeat: Learning Local Features of Accurate Shape and Localization. In Proceedings of the 2020 IEEE/CVF conference on computer vision and pattern recognition, Seattle, WA, USA, 13–19 June 2020; pp. 6588–6597. [Google Scholar]
- Revaud, J.; Philippe, W.; César, D.S.; Noe, P.; Gabriela, C.; Yohann, C.; Martin, H. R2d2: Repeatable and Reliable Detector and Descriptor. arXiv
**2019**, arXiv:1906.06195. [Google Scholar] - Schönberger, J.L.; Hardmeier, H.; Sattler, T.; Pollefeys, M. Comparative evaluation of hand-crafted and learned local features. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 6959–6968. [Google Scholar]
- Matas, J.; Chum, O.; Urban, M.; Pajdla, T. Robust Wide-Baseline Stereo from Maximally Stable Extremal Regions. Image Vis. Comput.
**2004**, 22, 761–767. [Google Scholar] [CrossRef] - Lopez-Garcia, F.; Fdez-Vidal, X.R.; Dosil, R. Scene Recognition through Visual Attention and Image Features: A Comparison between Sift and Surf Approaches. In Object Recognition; InTech: Rijeka, Croatia, 2011. [Google Scholar]
- Borji, A.; Sihite, D.N.; Itti, L. Quantitative Analysis of Human-Model Agreement in Visual Saliency Modeling: A Comparative Study. IEEE Trans. Image Process.
**2013**, 22, 55–69. [Google Scholar] [CrossRef] [PubMed] [Green Version] - Kovesi, P.D. Matlab and Octave Functions for Computer Vision and Image Processing. Centre for Exploration Targeting, School of Earth and Environment, The University of Western Australia. Available online: https://www.peterkovesi.com/matlabfns (accessed on 23 March 2021).
- Vedaldi, A.; Fulkerson, B. Vlfeat: An Open and Portable Library of Computer Vision Algorithms. In Proceedings of the 18th ACM international conference on Multimedia, Firenze, Italy, 25–29 October 2010. [Google Scholar]
- Moré, J. The Levenberg-Marquardt algorithm: Implementation and theory. In Lecture Notes in Mathematics; Springer: Berlin/Heidelberg, Germany, 1978; Volume 630. [Google Scholar]
- Xiao, J. Sfmedu. Available online: http://vision.princeton.edu/courses/SFMedu/ (accessed on 23 March 2021).
- Applied Dataset. Available online: https://drive.google.com/drive/flers/1wAJ1HMv8nck0001wEQkf42OETClA-zvY?usp=sharing. (accessed on 23 March 2021).

**Figure 4.** The employed synthetic image dataset with different transformations: (**a**) original image, (**b**) rotation, (**c**) scale, (**d**) viewpoint transformations.

**Figure 5.** The employed real datasets with their images and different camera networks. They feature high-resolution images, variable image overlap, and texture-less surfaces.

**Figure 6.** An example of the keypoint selection for an image pair with the SIFT detector: (**a**) SIFT-entropy, (**b**) SIFT-saliency, (**c**) SIFT-texture-coefficient.

**Figure 7.** Repeatability results for the synthetic image pairs S1 (left column) and S2 (right column): (**a**) rotation, (**b**) scale, (**c**) viewpoint change.

**Figure 8.** Precision results for the synthetic image pairs S1 (left column) and S2 (right column): (**a**) rotation, (**b**) scale, (**c**) viewpoint change.

**Figure 9.** Recall results for the synthetic image pairs S1 (left column) and S2 (right column): (**a**) rotation, (**b**) scale, (**c**) viewpoint change.

**Figure 10.** Global coverage results for the synthetic image pairs S1 (left column) and S2 (right column): (**a**) rotation, (**b**) scale, (**c**) viewpoint change.

**Figure 11.** Results for the real image block orientation: (**a**) re-projection error of the bundle adjustment for each dataset, (**b**) average intersection angles, (**c**) average rays per computed 3D point, (**d**) visibility of the derived 3D points in more than three images, (**e**) number of 3D points, (**f**) processing time.

**Figure 12.** Comparison results of the proposed method: (**a**) re-projection error of the bundle adjustment, (**b**) average intersection angles, (**c**) average rays per computed 3D point, (**d**) visibility of the derived 3D points in more than three images, (**e**) number of 3D points, (**f**) processing time.

| Algorithm | Parameter Name | Value |
|---|---|---|
| SIFT | Number of octaves | 4 |
| SIFT | Number of layers | 3 |
| SIFT | Initial Gaussian smoothing | 1.6 |
| SURF | Number of octaves | 3 |
| SURF | Number of layers | 3 |
| MSER | Minimum region area | 30 pixels |
| MSER | Maximum area variation | 0.2% |
| MSER | Step size between the intensity threshold levels | 1.2–5 *% |
| BRISK | Number of octaves | 4 |
| BRISK | Minimum intensity difference between a corner and its surrounding region | 0.28–0.41 * |
| BRISK | Minimum accepted quality of corners | 0.28–0.41 * |

| Dataset | No. of Images | Camera Model | Sensor Size (mm) | Resolution (pixel) | Pixel Size (µm) | Nominal Focal Length (mm) |
|---|---|---|---|---|---|---|
| R1 | 8 | DJI FC6310 | 13.2 × 8.8 | 5472 × 3648 | 2.41 | 8.8 |
| R2 | 18 | DJI FC6310 | 13.2 × 8.8 | 5472 × 3648 | 2.41 | 8.8 |
| R3 | 15 | DJI FC6520 | 13 × 17.3 | 5280 × 3956 | 3.28 | 12 |

**Rotation**

| Criterion | | MSER | MSER-Entropy | MSER-Saliency | MSER-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 83.57 | 58.53 | 62.34 | 60.35 |
| | Std. | 6.32 | 11.39 | 14.08 | 8.00 |
| Precision (%) | Mean | 45.09 | 73.41 | 45.83 | 60.08 |
| | Std. | 11.06 | 10.46 | 23.31 | 20.26 |
| Recall (%) | Mean | 19.84 | 24.18 | 17.04 | 19.40 |
| | Std. | 0.17 | 8.95 | 11.34 | 13.76 |
| Global coverage (%) | Mean | 57.07 | 29.49 | 10.47 | 20.65 |
| | Std. | 5.01 | 12.94 | 4.40 | 17.45 |
| RMSE (pixel) | Mean | 0.72 | 0.71 | 0.75 | 0.72 |
| | Std. | 0.31 | 0.27 | 0.43 | 0.29 |

**Scale**

| Criterion | | MSER | MSER-Entropy | MSER-Saliency | MSER-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 47.17 | 37.28 | 41.93 | 43.29 |
| | Std. | 19.74 | 19.85 | 18.30 | 18.20 |
| Precision (%) | Mean | 18.44 | 45.23 | 70.74 | 58.86 |
| | Std. | 14.41 | 23.18 | 12.65 | 13.88 |
| Recall (%) | Mean | 99.83 | 67.64 | 95.54 | 85.84 |
| | Std. | 0.21 | 17.55 | 5.50 | 7.47 |
| Global coverage (%) | Mean | 14.00 | 9.23 | 4.35 | 10.99 |
| | Std. | 3.43 | 2.68 | 4.87 | 1.65 |
| RMSE (pixel) | Mean | 1.39 | 1.27 | 1.414 | 1.413 |
| | Std. | 0.04 | 0.28 | 0.0006 | 0.0007 |

**Viewpoint change**

| Criterion | | MSER | MSER-Entropy | MSER-Saliency | MSER-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 39.33 | 26.32 | 32.31 | 29.17 |
| | Std. | 2.03 | 2.44 | 3.35 | 3.68 |
| Precision (%) | Mean | 34.58 | 76.48 | 55.59 | 53.84 |
| | Std. | 7.88 | 8.62 | 13.65 | 12.41 |
| Recall (%) | Mean | 19.82 | 24.29 | 17.70 | 20.60 |
| | Std. | 0.24 | 5.82 | 8.02 | 6.21 |
| Global coverage (%) | Mean | 62.01 | 31.66 | 10.69 | 21.23 |
| | Std. | 7.22 | 15.96 | 7.43 | 9.17 |
| RMSE (pixel) | Mean | 0.70 | 0.74 | 0.56 | 0.64 |
| | Std. | 0.07 | 0.13 | 0.22 | 0.12 |

**Rotation**

| Criterion | | SURF | SURF-Entropy | SURF-Saliency | SURF-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 50.39 | 31.17 | 39.00 | 33.77 |
| | Std. | 12.65 | 13.71 | 13.07 | 16.39 |
| Precision (%) | Mean | 61.23 | 81.50 | 76.99 | 68.44 |
| | Std. | 16.72 | 9.00 | 12.68 | 17.54 |
| Recall (%) | Mean | 45.45 | 86.08 | 78.90 | 79.25 |
| | Std. | 15.83 | 6.62 | 11.39 | 9.39 |
| Global coverage (%) | Mean | 81.56 | 80.39 | 34.95 | 62.89 |
| | Std. | 3.98 | 2.01 | 1.39 | 14.43 |
| RMSE (pixel) | Mean | 0.56 | 0.62 | 0.63 | 0.56 |
| | Std. | 0.13 | 0.14 | 0.13 | 0.10 |

**Scale**

| Criterion | | SURF | SURF-Entropy | SURF-Saliency | SURF-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 86.58 | 63.46 | 72.07 | 82.26 |
| | Std. | 2.69 | 5.24 | 4.83 | 2.02 |
| Precision (%) | Mean | 59.19 | 66.02 | 68.85 | 77.43 |
| | Std. | 14.34 | 11.27 | 12.97 | 11.87 |
| Recall (%) | Mean | 99.55 | 99.33 | 99.90 | 99.95 |
| | Std. | 0.39 | 0.80 | 0.12 | 0.07 |
| Global coverage (%) | Mean | 25.56 | 21.47 | 11.10 | 21.67 |
| | Std. | 15.62 | 14.78 | 8.27 | 14.41 |
| RMSE (pixel) | Mean | 1.39 | 1.37 | 1.40 | 1.410 |
| | Std. | 0.01 | 0.011 | 0.013 | 0.002 |

**Viewpoint change**

| Criterion | | SURF | SURF-Entropy | SURF-Saliency | SURF-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 13.58 | 9.11 | 11.69 | 11.74 |
| | Std. | 2.04 | 1.33 | 2.35 | 2.79 |
| Precision (%) | Mean | 22.02 | 48.75 | 57.85 | 43.96 |
| | Std. | 7.47 | 9.28 | 9.96 | 12.19 |
| Recall (%) | Mean | 25.37 | 55.42 | 61.88 | 54.26 |
| | Std. | 7.65 | 14.39 | 10.02 | 7.82 |
| Global coverage (%) | Mean | 35.27 | 44.83 | 31.01 | 39.62 |
| | Std. | 11.39 | 10.31 | 4.39 | 13.47 |
| RMSE (pixel) | Mean | 0.64 | 0.72 | 0.71 | 0.71 |
| | Std. | 0.03 | 0.04 | 0.03 | 0.06 |

**Rotation**

| Criterion | | BRISK | BRISK-Entropy | BRISK-Saliency | BRISK-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 74.12 | 50.51 | 60.77 | 56.73 |
| | Std. | 5.05 | 8.23 | 7.10 | 5.79 |
| Precision (%) | Mean | 72.09 | 71.53 | 78.22 | 71.77 |
| | Std. | 11.96 | 9.00 | 8.05 | 10.39 |
| Recall (%) | Mean | 41.87 | 50.96 | 52.16 | 51.58 |
| | Std. | 16.49 | 12.79 | 13.51 | 14.02 |
| Global coverage (%) | Mean | 71.91 | 57.24 | 30.89 | 58.48 |
| | Std. | 5.04 | 6.13 | 2.04 | 4.64 |
| RMSE (pixel) | Mean | 0.40 | 0.46 | 0.45 | 0.44 |
| | Std. | 0.03 | 0.05 | 0.05 | 0.05 |

**Scale**

| Criterion | | BRISK | BRISK-Entropy | BRISK-Saliency | BRISK-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 91.89 | 75.89 | 76.67 | 81.76 |
| | Std. | 2.07 | 3.46 | 4.78 | 3.97 |
| Precision (%) | Mean | 71.74 | 62.04 | 54.40 | 67.66 |
| | Std. | 10.55 | 11.26 | 15.76 | 10.86 |
| Recall (%) | Mean | 94.73 | 90.07 | 90.97 | 93.65 |
| | Std. | 1.70 | 4.36 | 4.84 | 3.04 |
| Global coverage (%) | Mean | 24.25 | 17.64 | 9.75 | 18.09 |
| | Std. | 14.28 | 9.90 | 6.93 | 10.08 |
| RMSE (pixel) | Mean | 1.397 | 1.380 | 1.382 | 1.392 |
| | Std. | 0.002 | 0.005 | 0.007 | 0.004 |

**Viewpoint change**

| Criterion | | BRISK | BRISK-Entropy | BRISK-Saliency | BRISK-Texture Coef. |
|---|---|---|---|---|---|
| Repeatability (%) | Mean | 29.42 | 25.30 | 17.43 | 25.23 |
| | Std. | 0.93 | 1.62 | 1.89 | 0.93 |
| Precision (%) | Mean | 51.09 | 56.11 | 62.60 | 51.75 |
| | Std. | 10.74 | 8.78 | 7.18 | 9.40 |
| Recall (%) | Mean | 23.63 | 37.71 | 42.43 | 35.77 |
| | Std. | 6.23 | 9.66 | 10.16 | 9.61 |
| Global coverage (%) | Mean | 40.36 | 32.78 | 16.73 | 31.53 |
| | Std. | 11.49 | 8.27 | 5.70 | 9.78 |
| RMSE (pixel) | Mean | 0.44 | 0.50 | 0.50 | 0.48 |
| | Std. | 0.04 | 0.05 | 0.05 | 0.06 |



## Share and Cite

**MDPI and ACS Style**

Mousavi, V.; Varshosaz, M.; Remondino, F.
Using Information Content to Select Keypoints for UAV Image Matching. *Remote Sens.* **2021**, *13*, 1302.
https://doi.org/10.3390/rs13071302
