An Improved RANSAC Outlier Rejection Method for UAV-Derived Point Cloud

Abstract: A common problem with matching algorithms in photogrammetry and computer vision is that they cannot find all correct corresponding points, the so-called inliers, and thus also produce incorrect or mismatched points, the so-called outliers. Many algorithms, including the well-known RANdom SAmple Consensus (RANSAC)-based matching, have been developed to reduce outliers. RANSAC-based methods, however, have limitations, such as increased false positive rates for outliers (and, consequently, fewer inliers), an unnecessarily high number of iterations, and high computational time. These deficiencies possibly result from the random sampling process, the presence of noise, and incorrect assumptions about the initial values. This paper proposes a modified version of the RANSAC-based methods, called Empowered Locally Iterative SAmple Consensus (ELISAC). ELISAC improves RANSAC through three basic modifications, applied individually or in combination: (a) increasing the stability and the number of inliers using two Locally Iterative Least Squares (LILS) loops (Basic LILS and Aggregated LILS), based on the new inliers found in each loop; (b) improving the convergence rate, and consequently reducing the number of iterations, using a similarity termination criterion; and (c) removing any possible outliers at the end of the processing loop and increasing the reliability of the results using a post-processing procedure. To validate the proposed method, a comprehensive experimental analysis was performed on two datasets. The first dataset contains the commonly used computer vision image pairs on which state-of-the-art RANSAC-based methods have been evaluated. The second dataset's image pairs were captured by a drone over a forested area with various rotations, scales, and baselines (from short to wide).
The results show that ELISAC finds more inliers at a higher speed (lower computational time) and with lower error (outlier) rates than the M-estimator SAmple Consensus (MSAC). This makes ELISAC an effective approach for image matching and, consequently, for 3D information extraction from very high and super high-resolution imagery acquired by space-borne, airborne, or UAV sensors. In particular, for applications such as forest 3D modeling and tree height estimation, where standard matching algorithms struggle due to the spectral and textural similarity of objects (e.g., trees) in image pairs, ELISAC can significantly outperform the standard matching algorithms.


Introduction
Matching is the process of finding corresponding points in two or more images of the same area (overlapping images) and is a fundamental step in 3D model generation in photogrammetry and computer vision [1]. Matching is used in many photogrammetric and computer vision applications, such as image registration, triangulation, 3D model and digital surface model (DSM) generation, change detection, target detection, and image mosaicking. Despite the existence of many matching algorithms, achieving accurate, fast, and highly reliable matching still faces significant limitations due to the complex characteristics of the images used in photogrammetric applications, as well as the requirements for improving the accuracy, speed, and reliability of this process [2][3][4][5][6][7][8]. For example, standard matching algorithms such as the well-known RANdom SAmple Consensus (RANSAC) do not provide promising results when applied to UAV imagery over forest areas due to the spectral and textural similarities of objects, resulting in a fairly large number of mismatched points (outliers or noise).
Matching is a fundamental step in photogrammetry for the generation of a DSM from a set of two or more overlapping images. The traditional photogrammetric procedure is effective when applied to imagery acquired by metric cameras (sensors) onboard aerial or space-borne platforms. Although there are UAVs with metric cameras, most UAVs on the market capture imagery with nonmetric cameras, which makes traditional photogrammetry ineffective for applications such as DSM generation. In addition, the super high spatial resolution of UAV images (i.e., centimeter level) makes conventional photogrammetric processing less effective [9,10]. In general, the DSM generation steps are (1) feature extraction and matching; (2) finding the best matches (outlier rejection); (3) triangulation, bundle block adjustment, and sparse point cloud generation; (4) point cloud densification; and, finally, (5) DSM generation. A high number of inliers increases the observations and directly improves the accuracy of the triangulation, the bundle block adjustment, and the sparse point cloud. Therefore, finding more inliers (best matches) is essential for generating a denser and well-distributed sparse point cloud and, consequently, a more precise DSM [11,12].
One of the challenges in processing UAV data for 3D generation is the presence of outliers in the matching step, which leads to sparse and dense point clouds of low accuracy and, ultimately, to a low-accuracy DSM of the scene [13]. Therefore, it is necessary to eliminate the outliers at each stage, including the generation of the tie, sparse, and dense point clouds, in order to increase the accuracy and quality of the matching process and, consequently, to generate an accurate DSM.
In general, outlier removal methods can be categorized as handcrafted and deep learning methods [14]. There are various traditional (handcrafted) methods for removing outliers, such as M-estimators, L-estimators, R-estimators, Least Median of Squares (LMedS), and the Hough transform [15]. One of the most widely used algorithms is RANSAC, which estimates both the matched features and the outliers [16]. RANSAC is based on the iterative selection of minimal random samples to estimate the model parameters [16]. To find correct matches, RANSAC needs a high number of iterations to detect the possible inliers among the existing outliers. In other words, RANSAC first selects an initial random sample of points and solves the model parameters (collinearity equations or fundamental matrix); it then counts the inliers and calculates the maximum iteration number (N). If the inlier ratio (the ratio between the number of inliers and the total number of points) is higher than in the last iteration, N is updated using the new number of inliers. This procedure continues until the iteration number reaches N. However, an important omission in RANSAC is that it considers only the minimum number of points required to solve the model parameters in every iteration and does not use the new inliers found in each iteration (the early best matches) to update the model and thereby increase the inliers before moving to the next iteration. As a result, a significant number of mismatches (outliers) remain after the process is completed. This shortcoming, together with other requirements such as large memory and computation time demands and the need for an error rejection threshold [15,16], reduces the efficiency of RANSAC as a matching outlier rejection method.
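The adaptive iteration bound N mentioned above follows the standard RANSAC relation N = log(1 − ρ)/log(1 − e^s), where e is the inlier ratio, s the minimal sample size, and ρ the desired probability of drawing at least one all-inlier sample. A minimal sketch in Python (the paper's implementation is in MATLAB; the function and parameter names here are ours, for illustration only):

```python
import math

def max_iterations(inlier_ratio, sample_size, p_success=0.99):
    """Adaptive RANSAC iteration bound: number of random draws needed
    so that, with probability p_success, at least one sample of
    sample_size points is outlier-free."""
    # Probability that a single random sample contains only inliers
    p_good_sample = inlier_ratio ** sample_size
    if p_good_sample <= 0.0:
        return float("inf")
    if p_good_sample >= 1.0:
        return 1
    return math.ceil(math.log(1.0 - p_success) / math.log(1.0 - p_good_sample))

# As the inlier ratio improves during the run, N shrinks sharply:
print(max_iterations(0.5, 8))   # low inlier ratio: many iterations needed
print(max_iterations(0.9, 8))   # high inlier ratio: very few iterations
```

This illustrates why finding more inliers early (as ELISAC does) directly reduces the number of iterations: each update of the inlier ratio tightens the bound N.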
Convolutional neural network (CNN) architectures, as deep learning methods, offer new approaches to feature detection [17], feature description [18], model design [19], and the matching procedure [14]. Deep learning-based methods train neural networks for the matching process, followed by a RANSAC-based loop for outlier rejection [20]. The difference between traditional and deep learning methods is that deep learning-based methods usually process all correspondences in one step without repeatedly generating hypotheses [21][22][23][24].
However, in comparison to well-tuned RANSAC-based methods, deep learning-based methods still need further study in terms of network structure [25], loss function [26], matching metrics [27], generalization ability [28], and typical image-matching problems such as large viewpoint changes [29], surface discontinuities [30], shadows [31], and repetitive patterns [14,32]. More importantly, after a careful review, Jin and colleagues concluded that, with proper settings, RANSAC-based methods still outperform the deep learning methods [20,33].
As discussed, deep learning-based methods are followed by RANSAC-based outlier rejection, so a good outlier rejection method that addresses the limitations of the RANSAC-based methods remains important [35,40,47]. This paper proposes a modified version of RANSAC, called Empowered Locally Iterative SAmple Consensus (ELISAC), which uses three basic enhancements to improve the performance of the RANSAC-based method in the following respects:
• the number of inliers found;
• a lower number of iterations;
• an increased convergence rate;
• the refinement of the final inlier output (reducing the remaining outliers in the last stage of the loop).
The next section (Section 2) starts by providing an overview of RANSAC-based methods followed by a detailed description of our proposed ELISAC and its three enhancement steps. Section 3 presents the experimental results of applying ELISAC to two datasets, a well-known dataset used in computer vision and UAV images of a forest area, and evaluates and discusses its performance on a dense point cloud, sparse point cloud, and DSM generation against the standard RANSAC algorithm and Agisoft commercial software (Agisoft LLC, Saint Petersburg, Russia) [48]. Finally, Section 4 presents our remarks and conclusions.

An Overview of the RANSAC-Based Methods
Since our proposed method is a modified version of RANSAC, it is essential to understand how RANSAC-based algorithms work to better understand the improvements (enhancements) we are proposing.
RANSAC and its extensions are arguably among the most common outlier rejection methods in photogrammetry and computer vision. RANSAC is an iterative two-step process; Algorithm 1 shows its pseudocode. In the first step, a small random sample (s points) is selected to determine the model's parameters (i.e., the collinearity or coplanarity conditions in photogrammetry and the essential or fundamental matrix in computer vision [49,50]). The interior and relative orientations are performed simultaneously using the collinearity equations (the model) and at least eight corresponding points [51] in two (or more) overlapping images, called tie points [52]. Using more points increases the degrees of freedom and, consequently, the model's geometrical strength. In the second step, the model is tested against the remaining tie points through a distance function (e.g., Euclidean or Sampson distance) to determine the number of inliers (I), the inlier ratio (e), and the number of iterations (N) using Equations (1) and (2). The inlier ratio is the ratio between the number of inliers and the total number of points (M):

e = I / M    (1)

N = log(1 − ρ) / log(1 − e^s)    (2)

where ρ is the desired probability of selecting a good (all-inlier) sample.
Algorithm 1: RANSAC.

Inputs: M: all tie points; s: minimum number of points required to solve the unknown parameters of the model; θ: a predefined threshold.
Output: I_global-best: global best inliers

iteration = 0, I_current-best = 0
While iteration < N
    Select an initial random sample (s points)
    Generate the hypothesis using the initial sample (collinearity equations)
    Evaluate the hypothesis (e.g., Euclidean distance for all tie points (M))
    Count the supporting points (I_iteration)
    If I_iteration > I_current-best
        I_current-best = I_iteration
        Update N based on the new I_current-best (Equation (2))
    End If
    iteration = iteration + 1
End While
I_global-best = I_current-best
Re-estimate the collinearity equations or the fundamental matrix using I_global-best

In the next iteration, if I is greater than that of the previous iteration, N is updated using the new I; otherwise, N remains unchanged. This procedure continues until the iteration number reaches N or the inlier ratio (e) exceeds a predefined threshold. Note that N is an adaptive termination (AT) criterion, updated in each iteration based on e. Finally, the subset with the highest number of inliers is taken as the best match points, and the model generated using all inliers is taken as the best model.
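The hypothesize-and-verify loop of Algorithm 1 can be sketched in Python using a 2D line as a stand-in model (the paper estimates collinearity equations or a fundamental matrix instead; all names below are illustrative, and the seed is fixed only for reproducibility):

```python
import math
import random

def ransac_line(points, s=2, threshold=0.1, p_success=0.99, max_n=10000):
    """Minimal RANSAC sketch: a 2D line y = a*x + b stands in for the
    collinearity / fundamental-matrix model of the paper."""
    best_inliers, n, iteration = [], max_n, 0
    while iteration < n:
        x1, x2 = random.sample(points, s)            # minimal random sample
        if x2[0] == x1[0]:                           # degenerate sample, skip
            iteration += 1
            continue
        a = (x2[1] - x1[1]) / (x2[0] - x1[0])        # generate hypothesis
        b = x1[1] - a * x1[0]
        inliers = [p for p in points
                   if abs(p[1] - (a * p[0] + b)) < threshold]
        if len(inliers) > len(best_inliers):         # count supporting points
            best_inliers = inliers
            e = len(best_inliers) / len(points)      # inlier ratio
            denom = math.log(1.0 - min(e ** s, 1 - 1e-12))
            n = min(max_n, math.ceil(math.log(1.0 - p_success) / denom))
        iteration += 1
    return best_inliers

random.seed(0)  # reproducible sketch
pts = [(x, 2.0 * x + 1.0) for x in range(20)] + [(3.0, 40.0), (7.0, -5.0)]
best = ransac_line(pts)
print(len(best))   # the 20 points on y = 2x + 1 survive; the 2 outliers do not
```

Note how the model is always re-generated from a minimal sample only; the inliers found in one iteration are never fed back into the estimation, which is exactly the gap the LILS loops below address.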

ELISAC: Empowered Locally Iterative SAmple Consensus
Our proposed ELISAC method improves the MSAC variant of RANSAC in three ways. MSAC operates similarly to RANSAC but, in contrast to RANSAC, assigns a cost to both inliers and outliers when evaluating the hypothesis [35]. First, ELISAC increases the stability and the number of inliers by introducing two Locally Iterative Least Squares (LILS) loops, Basic LILS and Aggregated LILS; these two loops are interchangeable. Second, it improves the convergence rate and, consequently, reduces the number of iterations using a Similarity Termination (ST) criterion. Third, it removes any remaining outliers and thus increases the reliability of the results using a post-processing procedure. The highlighted boxes in Chart 1 show these proposed improvements.

Locally Iterative Least Squares (LILS) Loop
As described earlier, MSAC uses an initial minimal random sample of matched points (eight points in our case) to estimate the model parameters (collinearity equations) and evaluates the model against all other matched points to determine the inliers. If the number of inliers found exceeds the previous best, the current best inliers are updated (Algorithm 1). However, the algorithm does not use these early-found inliers to re-estimate and improve the model. To include the early best matches (inliers) in improving the model at each iteration, we propose two types of Locally Iterative Least Squares (LILS) loops. These loops enhance the performance of MSAC in terms of stability, the number of inliers found, and the convergence rate.

Basic LILS
Algorithm 2 shows the Basic LILS loop, where all inliers found at each iteration are used directly to solve the unknown parameters of the model (collinearity equations). The parameters are estimated using a least squares solution. In other words, unlike MSAC, where the model is updated using a minimum number of points at each iteration, Basic LILS uses all inliers found at each iteration and applies a least squares solution to improve the model (Algorithm 2). Once the model is updated, it is applied to all other points to find more inliers, and the process continues until the inlier ratio meets the threshold or the number of iterations reaches N (i.e., adaptive termination, AT). The number of inliers found by this method is significantly higher than that of any initial random sampling procedure (even with uncontaminated samples) in the RANSAC-based methods. After each inner iteration, the AT criterion is updated based on the highest number of inliers found by the local loop. This process is called AT-Basic.

Algorithm 2: Basic LILS.

Inputs: M: all match points; s: minimum number of points required to solve the unknown parameters of the model; θ: a predefined threshold.
Output: I_global-best: global best inliers

iteration = 0, I_current-best = 0, I_save-best = 0
While iteration < N
    Select an initial random sample (s points)
    Generate a hypothesis using the initial sample
    Evaluate the hypothesis (evaluation procedure against all data points (M))
    Count the supporting data (I_iteration)
    If I_iteration > I_current-best
        I_current-best = I_iteration
        Select all inliers (I_current-best) as the initial sample
        Generate a hypothesis (least-squares-based) using the initial sample
        Evaluate the hypothesis (evaluation procedure against all data points (M))
        Count the supporting data (I_loop-best)
    End If
    If I_current-best ≥ I_save-best
        I_save-best = I_current-best
    Else
        I_current-best = I_save-best
    End If
    Check the ST criterion (optional) and terminate if it is satisfied
    Update N based on the new I_current-best (AT-Basic criterion)
    iteration = iteration + 1
End While
I_global-best = I_current-best
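The local refinement at the heart of Basic LILS, refitting the model on all current inliers by least squares and then recounting the consensus set, can be illustrated with the same 2D-line stand-in used earlier (a sketch with illustrative names, not the paper's MATLAB implementation):

```python
def least_squares_line(points):
    """Ordinary least-squares fit of y = a*x + b over all given points."""
    n = len(points)
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

def basic_lils_step(points, inliers, threshold=0.1):
    """Basic LILS refinement: refit the model on *all* current inliers
    (not just the minimal sample) and recount the consensus set."""
    a, b = least_squares_line(inliers)
    return [p for p in points if abs(p[1] - (a * p[0] + b)) < threshold]

# A hypothesis from a minimal sample caught only part of the structure:
pts = [(x, 2.0 * x + 1.0) for x in range(10)] + [(4.0, 30.0)]
partial = pts[:4]                       # inliers found by one iteration
refined = basic_lils_step(pts, partial)
print(len(refined))                     # the refit recovers all 10 line points
```

The refit uses far more observations than the minimal sample, so the model stabilizes and its consensus set grows within a single iteration, which is precisely the mechanism the text describes.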

Aggregated LILS
The second proposed local loop, Aggregated LILS, uses a procedure similar to the Basic LILS. The only difference is that it uses the aggregated inliers found in the current and previous loops of each iteration, which increases the number of inliers and speeds up the overall convergence (Algorithm 3). The idea of aggregation was proposed by [53], who aggregated the best models obtained in each iteration of a local optimization step using a statistical weighting procedure. Our proposed aggregation method simply combines the best inlier sets found after each Basic LILS loop without any weighting procedure. In this step, the AT criterion is updated based on the highest number of aggregated inliers found using the local loop. This process is called AT-Improved.

Algorithm 3: Aggregated LILS.
Inputs: M: all match points; s: minimum number of points required to solve the unknown parameters of a model; θ: a predefined threshold.
Output: I_global-best: global best inliers

iteration = 0, I_current-best = 0
While iteration < N
    Select an initial random sample (s points)
    Generate a hypothesis using the initial sample
    Evaluate the hypothesis (evaluation procedure against all data points (M))
    Count the supporting data (I_iteration)
    If I_iteration > I_current-best
        I_current-best = I_iteration
        I_loop-best = 0
        While (I_loop-best > I_current-best OR I_loop-best = 0)
            If I_loop-best ≠ 0
                I_current-best = I_loop-best
            End If
            Select all inliers (I_current-best) as the initial sample
            Generate a hypothesis (least-squares-based) using the initial sample
            Evaluate the hypothesis (evaluation procedure against all data points (M))
            Count the supporting data (I_loop-best)
        End While
    End If
    Check the ST criterion (optional) and terminate if it is satisfied
    Update N based on the new I_current-best (AT-Improved criterion)
    iteration = iteration + 1
End While
I_global-best = I_current-best
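Assuming the aggregation is the plain, unweighted union of consecutive best inlier sets described in the text, it can be sketched as follows (the helper name is ours):

```python
def aggregate_inliers(previous_best, current_best):
    """Aggregated LILS combines the inlier sets of the previous and the
    current local loops by simple set union, with no statistical
    weighting (in contrast to the weighted aggregation of [53])."""
    return sorted(set(previous_best) | set(current_best))

# Two consecutive local loops that each found a partly overlapping inlier set:
prev = [(0, 1), (1, 3), (2, 5)]
curr = [(1, 3), (2, 5), (3, 7)]
print(aggregate_inliers(prev, curr))   # four distinct points survive
```

Because the union can only grow, the aggregated set feeds a larger sample into the next least-squares refit, which is why this variant converges faster than Basic LILS.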

The Similarity Termination (ST) Criterion
In the standard MSAC algorithm, the process terminates either when the inlier ratio meets the predefined threshold (θ) or when the number of iterations reaches N. Since N depends on the inlier ratio e, the number of inliers (I) found at each iteration directly affects termination. A small I limits the search space and hence increases the chance of selecting a local rather than a global optimum. In contrast, a high number of samples (i.e., when the LILS enhancement is applied) disproportionately increases the computational time. Therefore, an efficient stopping criterion should balance local and global searches. To increase the convergence rate and decrease the computational time, we propose an additional termination criterion named Similarity Termination (ST). ST considers the similarity of the inlier points between two consecutive iterations: if the similarity exceeds 95%, the algorithm terminates; otherwise, it continues until either AT or ST is satisfied. ST requires no input parameter, which is an essential factor, especially when the inlier ratio is unknown.
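A sketch of the ST check follows; the text does not define the similarity measure exactly, so the overlap relative to the larger of the two inlier sets is assumed here, and the names are illustrative:

```python
def similarity_termination(prev_inliers, curr_inliers, tau=0.95):
    """ST criterion sketch: stop when the inlier sets of two consecutive
    iterations are at least tau similar. The similarity measure is an
    assumption: shared inliers divided by the size of the larger set."""
    if not prev_inliers or not curr_inliers:
        return False
    common = len(set(prev_inliers) & set(curr_inliers))
    return common / max(len(prev_inliers), len(curr_inliers)) >= tau

a = list(range(100))
print(similarity_termination(a, a[:96]))   # 96% overlap: terminate
print(similarity_termination(a, a[:50]))   # 50% overlap: keep iterating
```

The check only compares two already-computed index sets, so it adds negligible cost per iteration while allowing the loop to stop long before the AT bound N is reached.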

Post-Processing Procedure
To further clean the inliers obtained in the previous steps of any remaining outliers (our experimental results show that residual outliers in the final result are quite common), we propose a post-processing procedure (PPP). For this purpose, a final outlier rejection process (e.g., Basic LILS) is applied to the inliers found in the final results. Since the inlier ratio (e) is high at this point, N is low, and implementing the PPP step therefore adds no significant computational burden to the whole process. Once the inliers are inspected and possible outliers are removed, the final inliers are used to estimate the final, accurate model of the collinearity equations or the fundamental matrix.
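A sketch of what such a post-processing pass might look like, using a least-squares refit and a residual filter as the stand-in rejection step (illustrative names and a 2D-line model, not the paper's MATLAB code):

```python
def post_process(final_inliers, fit_model, residual, threshold):
    """PPP sketch: refit the model on the final inlier set and drop any
    member whose residual still exceeds the threshold."""
    model = fit_model(final_inliers)
    return [p for p in final_inliers if residual(model, p) < threshold]

def fit(pts):
    """Least-squares line fit y = a*x + b (stand-in model)."""
    n = len(pts)
    sx = sum(x for x, _ in pts); sy = sum(y for _, y in pts)
    sxx = sum(x * x for x, _ in pts); sxy = sum(x * y for x, y in pts)
    a = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    return a, (sy - a * sx) / n

# One leftover outlier hides among the final inliers:
inliers = [(x, 2.0 * x + 1.0) for x in range(50)] + [(10.0, 26.0)]
clean = post_process(inliers, fit,
                     lambda m, p: abs(p[1] - (m[0] * p[0] + m[1])), 0.5)
print(len(clean))   # the leftover outlier is filtered out
```

Because the input is already dominated by inliers, the refit is stable and a single pass suffices, which matches the observation in the text that PPP adds little computational cost.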

Experiments and Results
The proposed methods are evaluated both quantitatively and qualitatively. The quantitative evaluation is done by comparing the number of inliers and the relative computational time to MSAC. Then, we visualize how our best-performing algorithm improves the quality of the generated DSM. To do so, we compare our generated DSM with a DSM generated with well-known commercial software (Agisoft).
For this study, the Scale-Invariant Feature Transform (SIFT) algorithm was used to extract match points for all the images [54]. The collinearity equations with a normalized eight-point model were used to estimate the epipolar geometry (hypothesis generation) [52,55]. The Sampson distance, with a 0.3-pixel threshold and a 95% confidence value, was used as the error function between each point and its projection in the other image. The Sampson distance is a first-order approximation of the squared distance between a point x and the corresponding epipolar line [56]. The projection is calculated by applying epipolar geometry principles.
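The Sampson distance used here has the standard closed form d = (x₂ᵀFx₁)² / ((Fx₁)₁² + (Fx₁)₂² + (Fᵀx₂)₁² + (Fᵀx₂)₂²) for a fundamental matrix F and homogeneous image points x₁, x₂. A small self-contained sketch (the toy F below is ours, chosen for a rectified pair where epipolar lines satisfy y' = y):

```python
def mat_vec(M, v):
    """3x3 matrix times a homogeneous 3-vector."""
    return [sum(M[i][j] * v[j] for j in range(3)) for i in range(3)]

def sampson_distance(F, x1, x2):
    """First-order (Sampson) squared distance of a correspondence
    (x1, x2) to the epipolar geometry encoded by the 3x3 fundamental
    matrix F; x1 and x2 are homogeneous points [x, y, 1]."""
    Fx1 = mat_vec(F, x1)
    Ft = [[F[j][i] for j in range(3)] for i in range(3)]   # transpose
    Ftx2 = mat_vec(Ft, x2)
    algebraic = sum(x2[i] * Fx1[i] for i in range(3))      # x2^T F x1
    denom = Fx1[0] ** 2 + Fx1[1] ** 2 + Ftx2[0] ** 2 + Ftx2[1] ** 2
    return algebraic ** 2 / denom

# Toy F for a pure horizontal-shift (rectified) pair: epipolar lines y' = y.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
print(sampson_distance(F, [3.0, 5.0, 1.0], [9.0, 5.0, 1.0]))  # 0.0: on the line
print(sampson_distance(F, [3.0, 5.0, 1.0], [9.0, 7.0, 1.0]))  # > 0: off the line
```

A correspondence is then accepted as an inlier when this distance falls below the chosen threshold (0.3 pixel in this study).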

Dataset
To investigate the potential of the proposed scenarios, a freely available dataset [57] was selected first. These image pairs have been tested with some of the common state-of-the-art RANSAC-based methods [36,57]. Therefore, we compared the accuracy, computational time, and number of inliers generated by our method on this dataset with MSAC's output. The dataset contains image pairs with rotation, scale, and viewpoint changes (Figure 1). For these image pairs, the number of SIFT points is given in Table 1 (167, 418, 227, 730, 157, and 97 points, respectively).

The second dataset consists of image pairs captured by an FC330 sensor on a DJI Phantom 3 platform at an average flight altitude of about 60 m over an area covered by forest stands (mostly coniferous), clear cuts, roads, and a single building. We purposely selected this area to test the performance of the proposed algorithm on such land cover types. The parameters of the UAV images are listed in Table 2. Among the acquired UAV images, image pairs with different overlaps, various rotation angles, and different baselines (short to wide) were selected over dense, semi-dense, and sparse forestry areas to assess the proposed enhancements. Figure 2 shows the original images. We compare the accuracy, computational time, and number of inliers generated by our method with MSAC's output. For these image pairs, the number of SIFT points is given in Table 3.


Performance Evaluation
A total of two scenarios based on different enhancements, Basic and Aggregated LILS, were tested on both datasets. The first scenario combined Basic LILS, AT-Basic, ST-criterion, and PPP. The second scenario combined Aggregated LILS, AT-Improved, ST-criterion, and PPP.
For the performance evaluation of the proposed method, we compared the number of inliers and the computational time of each algorithm with the corresponding values for the MSAC algorithm. To compare the number of inliers, we used the following metrics for each image pair: the average, minimum, and maximum number of inliers, and the RMSE of the number of inliers, with the RMSE averaged over 100 runs. The computational time, also averaged over 100 runs, is reported as a ratio relative to MSAC for the same image pairs. Tables 4 and 5 summarize the results for each image pair in both datasets. Figures 3 and 4 visualize the values reported in Table 4, and Figures 5 and 6 visualize the values reported in Table 5.

As the results suggest, both algorithms (Aggregated and Basic LILS) find more inliers than MSAC for all the image pairs, at least 10% more. According to [37,58], the number of required samples (related to the inlier ratio) and the number of data points used for hypothesis evaluation are the main factors affecting the convergence speed of RANSAC-based methods; that is, a lower inlier ratio or a higher number of data points increases the computational time. Tables 4 and 5 demonstrate that the Basic and Aggregated LILS loops improve the convergence rate as well as the stability and the number of inliers.

Tables 4 and 5 also show that, for the proposed algorithm, the number of data points has the strongest effect on the computational time: increasing the total number of data points (e.g., pair a and e) increases the computational time. Additionally, for both scenarios and both datasets, the computational time for all image pairs is lower than that of MSAC, except for the first image pair (a and b) of the UAV dataset. The likely reason is the much larger number of inliers found relative to MSAC, approximately 25% more. Further, the computational time of Aggregated LILS with the AT-Improved criterion is lower than that of Basic LILS with the AT-Basic criterion.

All the proposed scenarios behave similarly in terms of the standard deviation of the inliers. In conclusion, the standard deviation of the inliers for all the proposed scenarios is significantly lower than for MSAC, which shows that both Basic and Aggregated LILS find approximately the same results across different runs, regardless of the number of outliers/inliers and the type of baseline.

Point Cloud and DSM Comparison
To compare the point clouds and the DSM, we use Aggregated LILS, as it performs slightly better than Basic LILS. We chose three overlapping image pairs over dense, semi-dense, and sparse forestry areas to assess the proposed method; Figure 7 shows the three selected UAV image pairs. For the point cloud and DSM comparison, we used the commercial Agisoft software. Since the purpose of this study was a relative comparison, the generated products were aligned with each other before each assessment using CloudCompare software (Électricité de France, Paris, France).
Figure 8 shows the sparse point clouds generated by our proposed method and commercial Agisoft for the selected image pairs. Additionally, Table 6 summarizes the number of inliers in the sparse point clouds generated by each method.


Method
As shown in Figure 8 and Table 6, the proposed method generates a denser sparse point cloud than Agisoft. This was expected, as our method detects more inliers in all the image pairs.
The results after densification are also shown in Figure 9.
These point clouds are cropped around the edges to avoid the effect of perspective distortions at the image borders. The densification of the point clouds based on the proposed procedure is implemented in MATLAB. Due to the memory limitations of MATLAB, our dense point clouds have fewer points than those of Agisoft. For a better visualization of the difference between the generated dense point clouds, we subtracted the corresponding point clouds using CloudCompare v2.9.alpha (64-bit) software (Électricité de France, Paris, France) based on the Iterative Closest Point (ICP) method. This comparison can also be used to show the elevation differences between the two point clouds. Since both dense point clouds are generated from the same images, the difference highlights the areas that are modeled by the proposed method but not by Agisoft; in Figure 10, similar areas are shown in blue. The visual comparison of both point clouds shows that they match over most areas (blue dots in Figure 10). The most significant difference can be observed in forestry areas, where some single trees were detected by the proposed method but are missing from the Agisoft point cloud (green, yellow, and red dots in Figure 10).
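The cloud subtraction described above is, in essence, a cloud-to-cloud (C2C) distance: for every point of one cloud, the distance to its nearest neighbor in the other cloud. A minimal brute-force sketch follows (a KD-tree would replace the brute-force search at realistic cloud sizes); the toy "ground plus one tree" clouds are assumptions for illustration only.

```python
import numpy as np

def cloud_to_cloud_distance(reference, compared):
    """Distance from each point of `compared` to its nearest neighbor
    in `reference` -- the usual cloud-to-cloud (C2C) difference.
    Large distances flag structures present in only one cloud."""
    diff = compared[:, None, :] - reference[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1)).min(axis=1)

# Toy clouds: flat-ish ground, plus one extra cluster in `ours`
# (e.g., a tree reconstructed by one pipeline but not the other).
rng = np.random.default_rng(1)
ground = rng.random((200, 3)) * [10.0, 10.0, 0.2]
tree_pts = rng.random((20, 3)) * [0.5, 0.5, 3.0] + [5.0, 5.0, 0.0]
ours = np.vstack([ground, tree_pts])
theirs = ground.copy()

d = cloud_to_cloud_distance(theirs, ours)
missed = ours[d > 0.5]   # points with no nearby counterpart (the "tree")
```

Coloring points by `d` reproduces the blue-to-red maps of Figure 10: near-zero distances (matching areas) in blue, large distances (structures missing from one cloud) toward red.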
The generated dense point clouds are then used to produce DSMs using ArcGIS 10.5. The results are demonstrated in Figures 11-13, in which the trees detected by our method, but not by Agisoft, are highlighted with red circles; the DSM differences and the profile graphs are also shown. These results demonstrate the impact of detecting more inliers on achieving a more detailed point cloud and, consequently, a better DSM in the forestry areas. As can be seen in Figures 11b,c, 12b,c and 13b,c, several trees are detected by our method but not by Agisoft.
Figure 13. The image captured at the desired area (a), the DSM generated by Agisoft (b) and by the proposed method (c), from the third dataset; the subtraction of the two DSMs, together with the profile graph, is also demonstrated (d).
The visual comparison of the surfaces (i.e., DSMs) generated by the two methods shows that the proposed procedure produces a more detailed DSM: some trees are absent from the DSM generated by Agisoft. However, the DSM based on the proposed method is smoother than the one generated by Agisoft, because of the lower density of its dense point cloud, which, as mentioned earlier, is due to the memory limitations of MATLAB.
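The DSM generation step performed here in ArcGIS 10.5 is conceptually a gridding operation: each raster cell takes the highest elevation among the points falling into it (so tree tops win over ground returns). A minimal sketch of that operation, with a hypothetical "ground plus one tree" toy cloud, could look like this:

```python
import numpy as np

def rasterize_dsm(points, cell=1.0):
    """Grid an (N, 3) point cloud into a DSM raster: keep the highest
    z per cell. Empty cells stay NaN (to be interpolated afterwards)."""
    xy_min = points[:, :2].min(axis=0)
    extent = points[:, :2].max(axis=0) - xy_min
    cols, rows = np.ceil(extent / cell).astype(int) + 1
    dsm = np.full((rows, cols), np.nan)
    ix = ((points[:, 0] - xy_min[0]) / cell).astype(int)
    iy = ((points[:, 1] - xy_min[1]) / cell).astype(int)
    for x, y, z in zip(ix, iy, points[:, 2]):
        if np.isnan(dsm[y, x]) or z > dsm[y, x]:
            dsm[y, x] = z   # highest return wins: this is a *surface* model
    return dsm

# Toy cloud: a 5 x 5 m flat ground with one 5 m "tree" at (2, 2).
pts = np.array([[x, y, 0.0] for x in range(5) for y in range(5)], float)
pts = np.vstack([pts, [2.0, 2.0, 5.0]])
dsm = rasterize_dsm(pts, cell=1.0)
# dsm[2, 2] is 5.0: the tree top wins over the ground point in that cell.
```

This also illustrates why a sparser dense cloud yields a smoother DSM: with fewer points per cell, fine canopy peaks are more likely to fall into empty or under-sampled cells and be flattened by the subsequent interpolation.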