Photogrammetric Methodology for the Production of Geomorphologic Maps: Application to the Veleta Rock Glacier

In this paper we present a stereo feature-based method using SIFT (Scale-invariant feature transform) descriptors. We use automatic feature extractors, matching algorithms between images and techniques of robust estimation to produce a DTM (Digital Terrain Model) using convergent shots of a rock glacier.The geomorphologic structure observed in this study is the Veleta rock glacier (Sierra Nevada, Granada, Spain). This rock glacier is of high scientific interest because it is the southernmost active rock glacier in Europe and it has been analyzed every year since 2001. The research on the Veleta rock glacier is devoted to the study of its displacement and cartography through geodetic and photogrammetric techniques.


Introduction
The study of rock glaciers in high-mountain zones requires the production of detailed maps for the analysis of their evolution and behaviour through time.Using geomatic techniques (geodesy, global positioning system, photogrammetry) it is possible to study the dynamics of geomorphologic structures with high precision.In order to obtain 3D reconstructions in rock glaciers, conventional photogrammetry (metric camera and normal shots) and cheaper methods (semi-metric cameras and convergent photographs) can be used [1][2][3][4], but we aim for a more automatic method.
Obtaining detailed maps (cartography) in rock glaciers is difficult, because the use of flights to obtain photogrammetric images in these areas is expensive.Our aim is to develop a quick and cheap method to obtain enough 3D points to create a detailed DTM.Thus, two methodological goals are considered:  Cartographic representation of the rock glacier, for example at a scale of 1/1,000 with a margin of error in the points of less than ±20 cm.This is the main objective of this study. Determination of the glacier dynamics through the time.Currently, we are working in this new methodology.
To fulfill these goals we use computer vision techniques to automatically detect as many points as possible of the natural environment which is specific to our example Veleta rock glacier.The application of computer vision techniques in this field has the following objectives:  The creation of a Digital Terrain Model (DTM) with enough points in order to obtain a geomorphic map of the Veleta rock glacier. Replacing costly photogrammetric flights using planes with less expensive photographs taken from the top of the surrounding mountains, or using balloons or hang-gliders. Elimination of extensive office work by a human operator in obtaining 3D points or contour lines by replacing photogrammetric restitution with an automatic method, which also eliminates human errors.

Characteristic of the Veleta Rock Glacier
The Veleta rock glacier is located in "Sierra Nevada" (37°N-3°W) in a cirque at the bottom of Mount "Veleta" (3,398 m. a.s.l.).Sierra Nevada is a National Park at the south of Spain (Figure 1a).This mountain is very vulnerable to gelifraction which results in much more rock material on the Veleta rock glacier.The Veleta rock glacier forms an open figure "╗" (Figure 1b).Its source is attached to the wall of the "Corral" and continues in a northern direction and later extends to the west direction.It has an average slope of 20º, with the front at 3,090 m. a.s.l. and the root at 3,175 m. a.s.l.It is a tongue-shaped rock glacier 35 m wide and 109 m long [5].

Geomatic Methods
Several geomatic techniques can be employed to study any kind of rock glacier: Interferometry radar, laser scanner, geodesy (angle and distance measurements), levelling (geometric, trigonometric), Global Positioning System (GPS), photogrammetry (aerial, terrestrial, close range) [6][7][8].In the case of Veleta rock glacier, the geodetic techniques (high precision total station), GPS and close range photogrammetry have been used [1].The orography around the Veleta rock glacier (300 m high walls) and the cost make the application of some geomatic techniques difficult.Some authors have used geodetic techniques and aerial photogrammety to obtain maps or to measure the glacier dynamics [3,[9][10][11].The advantage of the photogrammetry (aerial, terrestrial) over the geodesy is that photogrammetry can show the position of many points (photographic information), and it makes collecting information in the field much faster than geodesy [12,13].

Geodetic Survey and GPS
On the surface of the Veleta rock glacier, 20 control points were installed to monitor the glacier dynamics.These control points (iron rods) are measured every year to observe glacier dynamics and are also used for the photogrammetric process.
High precision topographic measurements were performed using a total station and a GPS.The total station is placed on the stable lateral moraine of the Veleta rock glacier and the maximum distance between the survey stations and the control points is 180 meters.The GPS method used is "RTK static".So, a receiver is fixed on a point of known coordinates (X,Y,Z) and the other receiver is successively placed at the iron rods for 1 minute to ensure that every point has a spatial precision or GDOP less than 3 cm [14].The control points for photogrammetry are the same points used for the monitoring of the dynamics of the rock glacier in order to reduce the time of data collection (Figure 2a).

Close range Photogrammetry (Conventional Photogrammetry)
A calibrated camera should be used to apply a photogrammetric general method and then the principal point, the focal length, the grid reseau coordinates and the lens distortion are known.In the general aerial photogrammetric method, the cameras are placed in the normal configuration in a vertical direction in such a way that the optical axes of both photographs are parallel and perpendicular to the baseline (Figure 3a).This can be done with a plane flight but it cannot be done by hand in the Veleta rock glacier.Here, the operator is on the Veleta peak with the camera in hand on an uneven surface (Figure 2b).The images obtained have a convergent geometry (Figure 3b), and we take advantage of the epipolar geometry concept to deal with these convergent photographs.We have taken several photographs from which we have selected pairs to carry out the photogrammetric process using a stereoplotter.
The photogrammetric method of restitution to obtain the internal, relative and absolute orientations was difficult and it took several trials [15,16].This is because the traditional restitution method is designed to work with image pairs in normal or quasi-normal configuration with parallel optical axis.
In summary, the general photogrammetric method used to obtain the map of the Veleta rock glacier involved the following issues:  Method of photogrammetry with analytical plotter "Leica SD-2000".
 The final result of accuracy in orientation was 6 cm planimetry (X, Y) and 8 cm altimetry (Z).
 The map was made from the stereoscopic orientation of the image pair through contour lines with the PRO600 software (Figure 4).

Computer vision for Photogrammetric Reconstruction
The goal of the photogrammetric method presented in this work is to obtain automatically a 3D model of a geomorphologic structure.This method develops preliminary work presented in the Sixth International Conference on Geomorphology [17].The methodology proposed is supported by means of techniques developed in the field of Computer Vision.The input requirements of the method are: a set of images of the geomorphologic structure taken from different points of view and the camera calibration (focal length, principal point and radial distortion).For the Veleta rock glacier example we work using a digital camera (Canon EOS 5D with 35 mm lens).In this section we present the different method steps which are summarized as follows:  Extraction features: Extraction of point features of each image using the SIFT extractor [17,18]. Matching of points (between image pairs). Robust matching through the epipolar geometry: The epipolar geometry (a geometric constraint) of the image pair is computed by using robust statistic techniques.Matches that do not hold the epipolar constraint are removed as mismatches. Camera processing (pose estimation): We know the internal parameters of camera (focal lenght, principal point, radial distortion) and we estimate the rotation and translation of the cameras. DTM generation: This step requires the camera projection matrices and a triangulation process to obtain a 3D model of the geomorphologic structure.

Extraction of Features
In the first step, feature points are extracted from each image.The more usual image features are points [19], or lines [20].Here we choose point features because they are more easily extracted in nonstructured environments.We use DoG (Difference of Gaussian) detector and SIFT descriptor to extract a set of point features [18].
The regions extracted with DoG detector are described with a vector of dimension 128 and the descriptor vector is divided by the square root of the sum of the squared components to get illumination invariance.The descriptor is a 3D histogram of gradient location and orientation.SIFT descriptors are very interesting because they are invariant to scale and rotation and have good properties with regard to repeatability, distinctiveness, and robustness, and then, SIFT has shown to be very useful for point matching purposes.For example, we can distinguish the same point in several images even with big scale and orientation changes.In this way we obtain a set of point features for each image of stereo pair using SIFT descriptors, which we use for find pair of corresponding points.
Although there are other descriptors or methods it was demonstrated with different measures that the SIFT descriptors are superior to most others.Several works have compared existing descriptors and concluded that SIFT is a good feature point extractor method in a general use or in photogrammetric applications [21].

Matching of Points
The SIFT method provides a set of image locations and descriptors.We will use this information for matching points process.We obtain a list of image 1 descriptors and a list of image 2 descriptors that form a database of keypoints.The best candidate for a matching is his closer keypoint neighbor in database, that is to say, the keypoint with lowest euclidean distance respect the correspondent descriptor.Nevertheless, many of the image points do not have an equivalent matching point.Thus, it is necessary a way to remove points whithout a good matching in database.
A priori, the more successful solution is to establish a threshold for the distance to the closer descriptor and even better to compare the distance between the closer neighbour and second closer neighbour.Using this method we remove 90% of wrong matched points and we loose only 5% of correct ones [18].As we can see in the next sections, SIFT matched points are very good for obtaining an accurate DTM of rock glacier area.

Robust Matching through the Epipolar Geometry
In this step the epipolar constraint is exploited to remove the wrong matches which still remain.Now, we briefly introduce some concepts related to the epipolar geometry [22] and how it can be used to perform a robust matching [23].
An image taken by a camera is the projection of a 3D scene in a 2D plane.Then, for each 3D point P = (X,Y,Z) we have a projected point in the image plane with coordinates p 1 = (u 1 ,v 1 ,1) in the first image and p 2 = (u 2 ,v 2 ,1) in the second.The epipolar geometry represents the intrisic geometry between two views.It is independent of the scene structure and only depends on the relative localization between the cameras (rotation and translation) and the internal camera calibration.The baseline is defined as the straight line that joins the optical centers of the cameras.The epipoles (e 1 ,e 2 ) are the intersection points of the baseline with the image planes.The fundamental matrix F is the algebraic representation of the epipolar geometry and it satisfies the so called epipolar constraint: Therefore, each pair of matched points (p 1 ,p 2 ) must hold this constraint; otherwise it is a wrong match.Each correspondence gives one equation through the epipolar constraint.The fundamental matrix is a 3 × 3 matrix and therefore it can be computed up to scale by solving a linear system from a minimum set of 8 points matches [22].Nevertheless we must assume that outliers and mismatches are present in the set of point matches.Thus, the fundamental matrix is computed iteratively with the RANSAC method [24], which is robust in presence of outliers.The result of this step is the fundamental matrix relating the pair of images.The point matches that support this fundamental matrix are considered good correspondences and the rest of matches are discarded as wrong matches.

Camera Processing (Pose Estimation)
We use a previously calibrated camera (focal length, principal point, radial distortion).Therefore if we want to get the complete camera system we need to calculate the exterior orientation (rotation and translation of cameras or camera pose).We use a nonlinear method to calculate the exterior orientation: The bundle adjustment [25].The purpose of bundle adjustment is to get an optimization to minimize the reprojection error through a Gradient Descent solution.We calculate an inicialization using DLT (Direct Linear Transform) and use an iterative approximation to minimize reprojection error.In this bundle adjustment process we use a set of control points, that is to say points with known 3D coordinates (Figure 5).Although bundle adjustment could be used to obtain a self-calibration, we only use it for exterior orientation process (pose camera estimation) since we know camera calibration.

DTM Generation
In this step the 3D reconstruction of the scene is performed to produce the DTM from the set of projection camera matrices.We know the internal camera parameters and the exterior orientation (pose estimation) from the camera processing step in section 4.4, therefore we have now a calibrated system.
As we know the camera projection matrices, the 3D structure may be recovered by triangulation [22].A simple scheme is proposed using a linear triangulation method to recover the 3D structure [26].This scheme is based in the fact that the image points have noise and therefore the rays back-projected from the noisy matches in the images are skew in the space.Since the rays do not intersect in general in just one point, the measured points do not satisfy exactly the epipolar constraint.Then, triangulation by back-projecting rays from the matches will fail.This problem is solved if the match points are corrected by minimizing a cost function which represents errors in the image.Once the points are corrected, a linear triangulation method can be carried out to obtain the 3D structure.This 3D structure can be improved using bundle adjustment process as we do for pose estimation.At the end of the triangulation we get a cloud of 3D points.The last step is to convert this 3D point cloud in a surface using a Delaunay triangulation.

Results and Discussion
We test the proposed methodology in Veleta rock glacier using 2008 campaign images.As we explain in previous sections we obtained these images using a digital camera, specifically a Canon EOS 5D with a 35 mm lens.We can use the proposed methodology with all the possible pair combinations, but the results are better when we use a pair of camera poses with enough baseline.In order to show the obtained results we follow the steps of our methodology working with a pair of images.We use images 1 and 5 with a baseline of 31.96m:  Extraction of features (SIFT).After using SIFT method in images 1 and 5 we get 136,232 keypoints and 94,775 keypoints respectively.As we explain in a previous section these keypoints are invariant to scale, pose, and changes of ilumination. Matching of points.The next step is the comparison of the two groups of keypoints.In this case we set a threshold of 0.6 for the distance between descriptors and we obtain 17,542 matching locations between image 1 and image 5. Figure 6 represents points (blue crosses) which have a matched point in the other image.Each represented point in left image has an equivalent point in the right image and vice versa. Robust matching through epipolar geometry.We use RANSAC method to calculate matrix F with SIFT matching points.The outliers percentage is very low, in this case there are two outliers and 15,540 inliers and a reprojection error average of 0.0594 pixels and a maximum of 3.36 pixels.In this case RANSAC use is not decisive, but it could be important in other examples to get robust matching. Camera processing.We show in Table 1 the reprojection error (in the control points E1, E2, ..., E9) obtained computing the cameras (P1, P2, ..., P9) using bundle adjustment method.In Figure 7 we can observe the camera poses respect the rock glacier.0.50  DTM generation.Once we know the camera projection matrices we triangulate corresponding points to get a 3D reconstruction.We used the method propose in Section 4.5, getting a reprojection error average of 1.40 pixels and a maximun of 4.50 pixels.Therefore we have 15,540 points (X,Y,Z) of the rock glacier area which we can convert in a surface using a Delaunay triangulation (Figure 8).9).We use cross sections of our reconstructed surface (15,540 points in a Delaunay triangulation surface) to calculate the obtained accuracy.We compare our cross sections with an accurate reconstruction (cross sections obtained using a total station Topcon GTS-502E).As we can see the results are acceptable (errors < 25 cm in height and planimetry) and therefore using this methodology we can get accurate surfaces (Figure 9).We have developed necessary code using Matlab software for method implementation and we have used Photomodeler and Topcon Image Master for internal camera calibration.

Conclusions
We propose an automatic technique for rock glacier reconstruction using digital images.A significant amount of work in feature matching techniques has been reported in computer vision in recent years improving old techniques.From the results given by our methodology matching process we can conclude that the set of matches obtained is very good.Using this kind of techniques the post processing cost and time needed to analyze image information of geomorphologic structures can be dramatically reduced avoiding the arduous task of manual photogrammetric restitution.
Usually the images are taken with the same camera but the proposed methodology presents the advantage that it works the same using several types of camera, being each one defined by its own calibration.Thus, the cartography will be automatically produced even if the photographs are taken with different cameras or their external geometry includes variations in orientation or scale.Besides, the non-metric basis of the fundamental matrix used for the robust matching step makes possible to work with different and uncalibrated cameras for obtaining the final set of point matches.
Another advantage of the proposed method is that it deals with convergent images to produce a DTM of rock glaciers.It allows one to work with different types of cameras with acceptable results.The results quality also depends on the camera configuration in the acquisition phase, where large image base pairs with convergent views produce better results.
The presented method offers several advantages for the study of geomorphologic structures in terms of cost and time.Thus, we do not need flights of planes to obtain the photographs.The geodetic data are obtained in short time and we avoid the manual photogrammetric restitution in the office.Using this methodology points are automatically matched between images obtaining that nearly 100% of the matches are correct.Therefore, it is a robust method for the automatic 3D restitution to be used in the production of geomorphologic maps.In order to show the validity of the methodology experiments with different pairs of images of the rock glacier have been performed and the most representative are shown.Future work is focussed toward dense matching and also on extending this method to images taken at different periods of time to study the evolution of the rock glacier.The objective is to have a cheap and fast tool for studying the evolution of the rock glacier along the time, which is our current research goal.

Figure 3 .
Figure 3. (a) Geometry of normal photogrammetry; (b) Geometry of convergent images.O 1 and O 2 are the centres of projection of the photographs.

Figure 6 .
Figure 6.SIFT matched points denoted with crosses between two images of the Veleta rock glacier area.

Figure 7 .
Figure 7. Camera pose estimation (front view and top view).

Figure 8 .
Figure 8. Example of Delaunay triangulation of the rock glacier (the view is oriented as in Figure9).

Figure 9 .
Figure 9.Comparison of our reconstructed surface cross sections (black polylines) and total station cross sections (blue polylines) and their situation in a Veleta rock glacier sketch.

Table 1 .
Camera system accuracy in the year 2008.