Target Localization for Autonomous Landing Site Detection: A Review and Preliminary Result with Static Image Photogrammetry

Abstract: The advancement of autonomous technology in Unmanned Aerial Vehicles (UAVs) has ushered in a new era in aviation. While UAVs were initially utilized only for military, rescue, and disaster-response operations, they are now being used for domestic and civilian purposes as well. To deal with these expanded applications and to increase autonomy, the ability of UAVs to perform autonomous landing will be a crucial component. Autonomous landing capability depends greatly on computer vision, which offers several advantages such as low cost, self-sufficiency, strong anti-interference capability, and accurate localization when combined with an Inertial Navigation System (INS). Another significant benefit of this technology is its compatibility with LiDAR and Digital Elevation Models (DEM), and the ability to seamlessly integrate these components. The landing area for UAVs can vary from static to dynamic or complex, depending on the environment. By examining these characteristics and the behavior of UAVs, this paper serves as a valuable reference for autonomous landing guided by computer vision and provides promising preliminary results with static image photogrammetry.


Introduction
Drones, or Unmanned Aerial Vehicles (UAVs), have a wide range of uses, from military applications to domestic uses, due to their ease of control, maneuverability, cost-effectiveness, and lack of pilot involvement. Certain drone operations, such as resource exploration and extraction, surveillance, and disaster management, demand autonomous landing, as human involvement in these operations may be dangerous, particularly if communication is lost. As a result, recent developments in drone technology have focused on replacing human involvement with autonomous controllers that can make their own decisions.
Launching a drone autonomously is relatively straightforward; the challenging aspect is landing it in a specific location with precision, especially in emergency situations. For an autonomous landing to be successful, the drone must have accurate information regarding its landing position, altitude, wind speed, and wind direction. Armed with these data, the UAV can adjust its landing approach, such as reducing speed or altering its altitude, to ensure a safe and successful landing. The landing area can be classified into three categories: static, dynamic, and complex. Static locations are those that are firmly fixed to the ground, such as helipads, airports, and roads. Dynamic locations, on the other hand, are landing areas positioned on moving objects, such as a helipad on a ship or drone landing areas on trucks or cars. Complex landing areas are those that have no markings on the surface and can pose a challenge, such as areas near water bodies, hilly regions, rocky terrain, and areas affected by natural disasters such as earthquakes and floods. Figure 1 gives a view of these types of landing areas. This work presents a significant contribution in the form of a comprehensive review of how photogrammetry can be applied to static imaging to enhance the accuracy of control decisions in autonomous landing. In addition, the research includes a useful analysis of various landing approaches, including those in static, dynamic, and complex areas. The analysis provides useful information about the possible uses of photogrammetry in combination with a vision-based landing system for accurately locating a target while a UAV lands autonomously. Numerous research endeavors have been dedicated to the topic of UAV autonomous landing, as it presents a range of challenges, including real-time processing, limited resources, high maneuverability requirements, and precise localization difficulties. These challenges are discussed further in subsequent sections.
In Section 2, a literature review is conducted, focusing on the various types of landing systems used in UAVs. This includes a discussion of the Electromagnetically Guided Landing System, Inertial Navigation System, and Vision-Based Landing System. Section 3 explores the different types of landing areas, including Static Landing Areas, Dynamic Landing Areas, and Complex Landing Areas. In Section 4, the focus shifts to landing on extra-terrestrial bodies, with a discussion of Pose Estimation Techniques and Object Classification. Section 5 is dedicated to finding the altitude from an image, with a discussion of Structure from Motion and Photogrammetry. The latter is further divided into four subcategories: Nadir Photogrammetry, Convergent Photogrammetry, Low Oblique Photogrammetry, and High Oblique Photogrammetry. Section 6 presents the preliminary work and results of the study, with a discussion of the performance of standard filters and feature-matching techniques. Finally, Section 7 wraps up the study with a summary of the main findings, addresses the study's constraints, and offers suggestions for future research.

Literature Review
The development of unmanned aerial vehicle technology dates back to ancient times. In fact, the concept of UAVs existed before the invention of manned flight. According to Chinese historical records, around 200 AD, the Chinese attached oil lamps to paper balloons to warm the air and make the balloons float. As the balloons hovered above their rivals at night, the enemy troops were frightened and believed that a divine force was behind the flight; this incident marks the first recorded flight in human history. During the First World War, in March 1917, the British tested a radio-controlled aircraft called "The Aerial Target", which marked the first recorded instance of a drone flying under controlled conditions [1]. Following this breakthrough, UAV technology experienced rapid growth. As the technology reached its saturation point within a few decades, researchers shifted their focus from developing various types of UAVs to autonomous flight. Autonomous landing plays a critical role in autonomous flight and, to this day, research on autonomous landing continues to progress and evolve the field of UAV technology. The pre-existing systems for the landing process of a drone are discussed next.

Electromagnetically Guided Landing System
Electromagnetically Guided Landing Systems rely on the use of electromagnetic fields to guide aircraft during the approach and landing phase. These fields are generated by ground-based transmitters and received by antennas on the aircraft. The system provides precise information about the aircraft's position, altitude, and velocity, allowing for highly accurate guidance during the landing phase. There are several types of electromagnetically guided landing systems, including Local Area Augmentation Systems (LAAS), Global Navigation Satellite System Landing Systems (GLS), Pulsed Localizers, and Millimeter-Wave Landing Systems (MWLS). Each system has its own unique advantages and disadvantages, and the choice of system will depend on factors such as the airport location, aircraft type, and operational requirements.

Inertial Navigation Systems
The Inertial Navigation System (INS) is a self-sufficient navigation mechanism that employs gyroscopes and accelerometers to monitor the three-dimensional motion of an aircraft. The accelerometers measure the aircraft's acceleration, while the gyroscopes measure the rate of change in the aircraft's orientation. These data are integrated over time to determine the aircraft's velocity and position relative to its starting point. INS is typically classified into two types: strap-down INS and gimballed INS. Strap-down INS is a more modern, lightweight design that directly measures the acceleration and angular velocity of the aircraft, while gimballed INS uses a rotating platform to maintain a stable orientation. Despite their high accuracy, INSs have some limitations, including errors caused by sensor drift over time and the need for periodic calibration. INS is widely used in aviation, particularly in navigation, flight control systems, and landing systems. Its ability to provide accurate and reliable navigation information makes it an essential tool for pilots and aircraft manufacturers.
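The double integration described above can be sketched in a few lines of code. This is a minimal one-axis illustration of strap-down dead reckoning, not flight software; the sample data and time step are illustrative, and it also shows why unmodeled sensor bias accumulates into growing position error.

```python
# Minimal sketch of strap-down INS dead reckoning along one axis:
# acceleration samples (m/s^2) are integrated twice (trapezoidal rule)
# to recover velocity and position relative to the starting point.

def dead_reckon(accels, dt):
    """Integrate acceleration samples taken at interval dt (seconds)."""
    velocity, position = 0.0, 0.0
    velocities, positions = [velocity], [position]
    for a_prev, a_curr in zip(accels, accels[1:]):
        velocity += 0.5 * (a_prev + a_curr) * dt   # trapezoidal integration
        position += velocity * dt
        velocities.append(velocity)
        positions.append(position)
    return velocities, positions

# Constant 1 m/s^2 acceleration for 1 s, sampled at 10 Hz:
v, p = dead_reckon([1.0] * 11, 0.1)
```

Because any constant accelerometer bias is integrated twice, the position error it causes grows quadratically with time, which is why INS requires periodic calibration or fusion with external fixes.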

Vision-Based Landing System
Vision-based landing systems are a popular approach to autonomous landing that relies on cameras and computer vision algorithms to detect and recognize landmarks on the ground. The system acquires images of the landing region and extracts features from them, which are then used to identify the landing area. The extracted features consist of high-level visual attributes such as texture and patterns, as well as low-level features such as edges, corners, colour, and brightness. After feature extraction, the system matches the features against a pre-existing database of features to recognize the landing area. While vision-based landing systems can be highly effective, they require sophisticated algorithms and powerful processing hardware to operate in real time, and extracting and localizing objects can be time-consuming. These limitations lead researchers to continually develop faster and more efficient approaches.
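As an illustration of the low-level feature extraction mentioned above, the following sketch computes a Sobel gradient magnitude, the basic edge response underlying many of the edge-based detectors cited later. It is a toy implementation on a list-of-lists image; a real system would use an optimized library such as OpenCV.

```python
# Minimal sketch of low-level edge-feature extraction: the Sobel
# gradient magnitude of a 2D grayscale image (borders left at zero).

def sobel_magnitude(img):
    """Return the gradient magnitude of a 2D grayscale image."""
    h, w = len(img), len(img[0])
    kx = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]   # horizontal Sobel kernel
    ky = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]   # vertical Sobel kernel
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = sum(kx[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(ky[j][i] * img[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out

# A vertical step edge: the gradient magnitude peaks along the boundary.
img = [[0, 0, 0, 9, 9, 9] for _ in range(5)]
mag = sobel_magnitude(img)
```

Responses like this feed the corner detectors, Hough transforms, and threshold-selection steps surveyed in the following sections.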
Several techniques and tools have been created for object identification and localization in images. The following list highlights some of the frequently utilized approaches. Iiyama et al. [2] utilized reinforcement learning to localize the object, which involves intercepting the map obtained from techniques such as the Digital Elevation Model (DEM) and Light Detection and Ranging (LiDAR) using an auto-encoder. Meanwhile, Skinner et al. [3] utilized uncertainty-aware learning with Bayesian and SegNet techniques to enhance the precision of the pixels selected for the landing site. This was achieved by integrating network uncertainty into the final safety map, resulting in the optimal output. Similarly, Minghui et al. [4] used BboxLocate Net, a Kalman filter estimator, and PointRefine Net for object classification and target localization. BboxLocate Net is a network designed to recognize the bounding boxes of a target and obtain its coordinates. The data obtained from this process can be used in conjunction with the extended Kalman filter to improve the precision of spatial localization. Finally, PointRefine Net is utilized to further enhance decision accuracy. Moreover, Yu et al. (2018) [5] employed various computer vision components such as DTM, DEM, DSM, and PSPNet network layers in combination with a MobileNet V2 base feature extractor to detect and localize objects. The authors also utilized YOLO V2 for identifying potential landing areas.
In their study, Bickel et al. (2021) [6] utilized three different methods (HORUS, DestripeNet, and PhotonNet) for object localization and classification. HORUS employs a physical noise model of the Narrow Angle Camera and environmental data to eliminate CCD and photon noise. The process involves using two deep neural networks sequentially to extract features and enhance data accuracy. Meanwhile, in the study by Ciabatti et al. (2021) [7], the classification task was achieved using deep reinforcement learning and transfer learning, specifically DDPG. Using the Bullet/PyBullet library, the authors developed a physical environment in which they defined a lander via the standard ROS/URDF framework. They utilized 3D terrain models obtained from official NASA 3D meshes from different missions to add realism to the simulation. An outline of the current approaches for object classification and localization is provided in Table 1.
Table 1. Types of pre-existing approaches for classification and localization of an object.

[2] Reinforcement learning: The map obtained through techniques such as the Digital Elevation Model (DEM) and Light Detection and Ranging (LiDAR) is intercepted using an auto-encoder, which then outputs the parameters.

[3] Uncertainty-aware learning with Bayesian SegNet: By including network uncertainty in the final safety map, the accuracy of the pixels considered for the landing site can be improved with the chosen network.

[4] BboxLocate Net, Kalman filter estimator, PointRefine Net: BboxLocate Net, a network that detects the bounding box of the target, extracts the target's coordinates. The extended Kalman filter is then utilized in conjunction with these data to refine the spatial localization accuracy. PointRefine Net is subsequently applied to further improve the decision-making accuracy.

[5] Computer vision module: To detect potential landing sites, the approach utilizes DTM, DEM, and DSM. For identifying visual landing sites, PSPNet network layers are combined with a MobileNet V2 base feature extractor. Meanwhile, YOLO V2 is used for object detection.

[6] HORUS, DestripeNet, PhotonNet: In the HORUS approach, a physical noise model of the Narrow Angle Camera (NAC) and environmental information are utilized to eliminate CCD and photon noise. The method uses two deep neural networks in sequence to extract features with a consistency of 3 to 5 m from summed and regular mode LRO NAC images, which aids in the elimination of various noise sources and enhances the precision of the outcomes.

[7] Deep reinforcement learning (DDPG) with transfer learning: To create a realistic physics simulator, the approach employs the Bullet/PyBullet library, which includes a lander defined using the standard ROS/URDF framework, as well as authentic 3D terrain models. These terrain models are generated by adapting official NASA 3D meshes derived from data gathered during numerous missions, ensuring an accurate representation of reality.
Aside from the approaches mentioned earlier, other algorithms are available for classification and localization. Various research studies have delved into several deep learning techniques, such as edge detection, AlexNet, ERU-Net, YOLO, DPM, R-CNN, OverFeat, U-Net, and the Structured Random Forest [8-16]. Unsupervised learning techniques such as the Hough transformation and its enhanced algorithms, genetic algorithms, the Hough transformation with the radial consistency approach, template matching, and morphological image processing have also been discussed [17-21]. Kang et al. (2018) [22], Xin et al. (2017) [23], and Urbach et al. (2009) [24] introduced several supervised learning techniques for object classification and localization, which include the Support Vector Machine, AdaBoost, and the Notation of Looking for Perspective. The KLT detector discussed in [25] employs a combined approach of supervised and unsupervised learning, while template matching, a combination of supervised and deep learning approaches, was discussed in [26].

Type of Landing Areas
The categorization of target locations is divided into three types. Cooperative-based landing pertains to landing sites that are clearly defined and labeled with identifiable patterns, such as the letter "T" or "H", a circle, a rectangle, or a combination of these shapes, based on specific geometric principles, as described in Xin (2022) [27]. The different types of landing area markings are depicted in Figure 2.

Static Landing Area
The localization of different types of markings relies on a range of techniques, from image processing to advanced machine learning. Localization of the "T"-marking helipad achieves the maximum precision at specific poses using Canny edge detection, the Hough transform, Hu invariant moments, affine moments, and adaptive threshold selection [28,29]. Localization of the "H" marking achieves a success rate of 97.3% using image segmentation, depth-first search, and adaptive threshold selection in [30], while it achieves the maximum precision at a pose of 0.56° using image extraction and Zernike moments in [31].
The "Circle" marking's localization is addressed in two studies, namely [32,33]. Through the implementation of solvePnPRansac and Kalman filters, Benini et al. [32] were able to achieve a position error of less than 8% of the diameter of the landing area, while the maximum precision at a pose of 0.08° was achieved using an extended Kalman filter in [33]. Detecting combined marking types is discussed in [34-37]. The maximum precision at specific poses using template matching, a Kalman filter, and a profile-checker algorithm was achieved in [34]. Meanwhile, another combined detection method achieves the maximum precision at a pose of 0.5° using tag boundary segmentation, image gradients, and adaptive thresholds [35]. Ref. [36] achieves maximum precision at a position error of less than 10 cm using Canny edge detection, adaptive thresholding, and Levenberg-Marquardt optimization. Lastly, the final combined detection approach obtains maximum precision at a position error of less than 1% using HOG, NCC, and AprilTags [37].
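The circle Hough transform used for "Circle" markings can be sketched as a simple voting scheme: every edge pixel votes for all centers that could have produced it at a given radius, and the cell with the most votes wins. The sketch below assumes a known radius and pre-extracted edge points; real detectors also search over radius and use an edge detector such as Canny first.

```python
# Minimal sketch of a circle Hough transform at a fixed, known radius.
# Edge pixels vote for candidate centers; the most-voted cell wins.
import math

def hough_circle_center(edge_points, radius, width, height):
    """Return the (x, y) center with the most votes for a given radius."""
    votes = {}
    for ex, ey in edge_points:
        for deg in range(0, 360, 5):           # sample candidate centers
            cx = round(ex - radius * math.cos(math.radians(deg)))
            cy = round(ey - radius * math.sin(math.radians(deg)))
            if 0 <= cx < width and 0 <= cy < height:
                votes[(cx, cy)] = votes.get((cx, cy), 0) + 1
    return max(votes, key=votes.get)

# Synthetic edge points on a circle of radius 10 centered at (20, 20):
pts = [(20 + round(10 * math.cos(math.radians(d))),
        20 + round(10 * math.sin(math.radians(d))))
       for d in range(0, 360, 15)]
center = hough_circle_center(pts, 10, 40, 40)
```

Because every edge point votes for the true center while spurious votes scatter, the method is robust to partial occlusion of the marking, one reason circular markings are popular for landing pads.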
Forster et al. [38] designed a system to detect and track a target mounted on a vehicle using computer vision algorithms and hardware. The system, used in the Mohamed Bin Zayed International Robotics Challenge 2017 (MBZIRC), employed a precise RTK-DGPS to determine the target location, followed by a circle Hough transform to accurately detect the center of the target. By tracking the target, the UAV adjusted its trajectory to match the movement of the vehicle. The system successfully met the requirements of the task and was ranked as the best solution, considering various constraints. There are some limitations to the system, such as weather conditions and vehicle speed, which may affect its performance. Overall, this paper provides an interesting approach to solving an important problem in robotics research, with potential applications in various fields such as aerial monitoring and humanitarian demining operations. An overview of the algorithms available for static landing site detection based on landing marking shape is presented in Table 2.

Dynamic Landing Area
Two categories of dynamic landing areas exist based on the motion of the platform, namely ship-based and vehicle-based landings. Due to the complexity of landing on a moving platform, LiDAR sensors are used in conjunction with computer vision techniques. One landing process is facilitated using a model predictive feedback linear Kalman filter, resulting in a landing time of 25 s and a position error of less than 10 cm [39]. Another algorithm uses nonlinear controllers, state estimation, a convolutional neural network (CNN), and a velocity observer to achieve maximum precision at positions of less than (10, 10) cm [40], while a third employs a deep deterministic policy gradient with Gazebo-based reinforcement learning and achieves a landing time of 17.5 s with a position error of less than 6 cm [41]. Lastly, the algorithm proposed in [42] uses the extended Kalman filter, extended H∞ filter, perspective-n-point (PnP), and visual-inertial data fusion to achieve maximum precision at positions of less than 13 cm.
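Since nearly every method above relies on Kalman filtering to track the moving platform, a one-dimensional constant-velocity filter is sketched below. State, gains, noise values, and measurements are all illustrative; it is meant only to show the predict/update cycle that the cited systems run per camera frame, and a simple diagonal process-noise term stands in for a properly discretized noise model.

```python
# Minimal sketch of a 1D constant-velocity Kalman filter of the kind
# used to track a moving landing platform from noisy position fixes.
# State is (position x, velocity v); P is its 2x2 covariance.

def kalman_track(measurements, dt=0.1, q=0.01, r=0.5):
    x, v = measurements[0], 0.0          # initial state
    pxx, pxv, pvv = 1.0, 0.0, 1.0        # covariance entries
    for z in measurements[1:]:
        # predict: x' = x + v*dt, P' = F P F^T + Q with Q = diag(q, q)
        x += v * dt
        pxx += dt * (2 * pxv + dt * pvv) + q
        pxv += dt * pvv
        pvv += q
        # update with position measurement z (H = [1, 0], noise r)
        s = pxx + r
        k_x, k_v = pxx / s, pxv / s      # Kalman gains
        innov = z - x
        x += k_x * innov
        v += k_v * innov
        pxx, pxv, pvv = (1 - k_x) * pxx, (1 - k_x) * pxv, pvv - k_v * pxv
    return x, v

# Platform moving at a steady 1 m/s, observed every 0.1 s:
pos, vel = kalman_track([0.1 * i for i in range(100)])
```

Even though only position is measured, the filter recovers the platform's velocity, which is what lets the UAV lead the target rather than chase it.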
The algorithm presented in [43] utilizes extended Kalman filtering and visual-inertial data fusion techniques to achieve a landing time of 40 s for ship-based landing area detection. Meanwhile, another algorithm employs a Kalman filter, an artificial neural network (ANN), feature matching, and Hu moments to achieve a position error of (4.33, 1.42) cm [44]. The approach outlined in [45] utilizes the EPnP algorithm and a Kalman filter, but no experimental results are presented. Battiato et al. [46] have introduced a system that enables real-time 3D terrain reconstruction and detection of landing spots for micro aerial vehicles. The system is designed to run on an onboard smartphone processor and uses only a single down-looking camera and an inertial measurement unit. A probabilistic two-dimensional elevation map, centered around the robot, is generated and continuously updated at a rate of 1 Hz using probabilistic depth maps computed from multiple monocular views. This mapping framework is shown to be useful for the autonomous navigation of micro aerial vehicles, as demonstrated through successful fully autonomous landing. The proposed system is efficient in terms of resources, computation, and the accumulation of measurements from different observations. It is also less susceptible to drifting pose estimates. An overview of the current dynamic landing methods is given in Table 3; the complexity of a moving platform calls for correspondingly sophisticated algorithms.

Complex Landing Area
The complex landing area presents a challenging task for autonomous landing systems. The terrain in these areas can contain various obstacles and hazards, and it is not always possible to find a suitable landing area. Researchers have explored different methods for identifying safe landing areas in complex terrain, but research in this area is limited. Fitzgerald and Mejia's research on a UAV critical landing site selection system is one such effort [43,44]. To locate a good landing spot, a monocular camera and a digital elevation model (DEM) are used. This system has multiple stages, including primary landing location selection, candidate landing area identification, DEM flatness analysis, and decision-making. The method's limitations stemmed from the fact that it used only the Canny operator to extract edges and that the DEM computation in the flatness estimation stage lacked resilience.
Research on unstructured emergency autonomous UAV landings using SLAM was conducted in 2018, in which the use of DEM and LiDAR was evaluated and their advantages and limitations were discussed [47]. The research used monocular vision SLAM and a point cloud map to localize the UAV and split the grid into different heights to locate a safe landing zone. After denoising and filtering the map using a mid-pass filter and 3D attributes, the landing process lasted 52 s, starting at a height of 20 m. The experimental validation was conducted in challenging environments, demonstrating the system's adaptability. To fulfill the demands of an autonomous landing, the sparse point cloud was partitioned based on different elevations. Furthermore, Lin et al. [48] considered landing scenarios in low-illumination settings.
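The elevation-based screening these systems perform can be illustrated with a small sketch: a height grid (as would be derived from a point cloud or DEM) is scanned with a window, and windows whose height spread stays under a tolerance are flagged as flat enough to land on. Grid values, window size, and the tolerance are all illustrative.

```python
# Minimal sketch of DEM flatness screening: flag window origins whose
# local height range (max - min) stays within a tolerance in metres.

def safe_cells(dem, window=3, max_spread=0.2):
    """Return (row, col) window origins whose height range <= max_spread."""
    rows, cols = len(dem), len(dem[0])
    safe = []
    for r in range(rows - window + 1):
        for c in range(cols - window + 1):
            patch = [dem[r + i][c + j]
                     for i in range(window) for j in range(window)]
            if max(patch) - min(patch) <= max_spread:   # flat enough
                safe.append((r, c))
    return safe

# Flat 0.0 m terrain with a 1.5 m obstacle at row 1, col 1:
dem = [[0.0] * 5 for _ in range(5)]
dem[1][1] = 1.5
zones = safe_cells(dem)
```

Real systems replace the max-min spread with robust slope and roughness statistics, but the principle of partitioning the grid by local elevation variation is the same.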
To account for the constantly changing terrain in complex landing areas, it is advisable to identify multiple suitable landing sites to ensure the safety of the UAV. Once landing coordinates have been identified, the optimal landing spot and approach path should be determined. Cui et al. [49] proposed a method that evaluates a landing area's criteria based on energy consumption, the safety of the terrain, and the craft's performance. A clearer comprehension of landing point selection is provided in Figure 3, which displays the identification of two landing targets, Target A and Target B, along with the corresponding trajectories to reach them. It is advisable to determine the shortest trajectory to the alternative target in the event of a last-minute change.
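Multi-criteria ranking of the kind described above can be sketched as a weighted score. The criteria, their normalization to [0, 1], and the weights below are hypothetical illustrations, not the scheme of [49]; the point is simply that returning candidates best-first leaves the runner-up available as the fallback target.

```python
# Minimal sketch of weighted multi-criteria landing-site ranking.
# Each site has (safety, energy, distance) values normalized to [0, 1];
# higher safety is better, lower energy and distance are better.

def rank_sites(sites, w_safety=0.5, w_energy=0.3, w_dist=0.2):
    """sites: dict name -> (safety, energy, distance). Best-first order."""
    def score(vals):
        safety, energy, dist = vals
        return w_safety * safety - w_energy * energy - w_dist * dist
    return sorted(sites, key=lambda name: score(sites[name]), reverse=True)

# Two hypothetical candidates in the spirit of Figure 3:
candidates = {
    "Target A": (0.9, 0.4, 0.5),
    "Target B": (0.7, 0.2, 0.1),
}
order = rank_sites(candidates)   # primary target first, fallback second
```

With these illustrative numbers the nearer, cheaper site outranks the safer but more distant one; shifting weight onto safety reverses the order, which is exactly the trade-off such cost functions encode.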

Landing on Extra-Terrestrial Bodies
For the problem described above, we can also consider adopting the landing approaches utilized by spacecraft on extra-terrestrial bodies [50]. Precise Landing and Hazard Avoidance (PL&HA) [51-56] and the Safe and Precise Landing Integrated Capabilities Evolution (SPLICE) are two projects that NASA has been working on [57]. Moreover, NASA is working on a project called ALHAT, which stands for Autonomous Landing Hazard Avoidance Technology [58-68]. However, a significant limitation of ALHAT is that it relies solely on static information such as slopes and roughness to select a landing area, making the maneuvering of the craft more challenging. S.R. Ploen et al. [69] proposed an algorithm that uses Bayesian networks to calculate approximate landing footprints, making such maneuvering feasible. Researchers are currently working on similar aspects. In that case, the classical method of object classification can be employed to simplify the process of identifying objects in space. However, prior to delving into object classification, it is essential to determine the angle and pose of the spacecraft for greater accuracy. Accordingly, the article discusses pose estimation techniques next, followed by object classification techniques.

Pose Estimation Techniques
Determining the spacecraft's pose requires an estimation based on the camera's angle with respect to the ground view. Deep learning models are mainly used for pose estimation. The previously available pose estimation techniques were surveyed by Uma et al. [70] and are summarized as follows.
In their respective studies, Sharma et al. [71] and Harvard et al. [72] employed different techniques to analyze the PRISMA dataset. Sharma et al. [71] utilized the Sobel operator, the Hough transform, and a WGW approach to detect features of the target, regardless of their size. On the other hand, Harvard et al. [72] utilized landmark locations as key point detectors to address the challenge of high relative dynamics between the object and the camera. They also employed the 2010 ImageNet ILSVRC dataset in their approach. A deep learning framework that utilizes soft classification and orientation, and outperformed straightforward regression, is presented by Proença et al. [73]; the dataset used in the study was URSO, an Unreal Engine 4 simulator. According to Sharma et al. (2018) [74], their approach demonstrated higher feature identification accuracy compared to traditional methods, even when dealing with high levels of Gaussian white noise in the images. The SA-LMPE method used by Chen et al. [75] improves posture refinement and removes erroneous predictions, while the HRNet model correctly predicted two-dimensional landmarks using the SPEED dataset. Zeng et al. (2017) [76] utilized deep learning methods to identify and efficiently represent significant features from a simulated space target dataset generated by the Systems Tool Kit (STK). Wu et al. (2019) [77] used a T-SCNN model trained on images from a public database to successfully identify and detect space targets in deep space photos. Finally, Tao et al. (2018) [78] used a DCNN model trained on the Apollo spacecraft simulation dataset from TERRIER, which showed resistance to variations in brightness, rotation, and reflections, as well as efficacy in learning and detecting high-level characteristics.

Object Classification
The objects of interest in this scenario are craters and boulders found on a particular extraterrestrial body. A crater is a concave structure that typically forms as a result of the impact of a meteoroid, asteroid, or comet on a planet's or moon's surface. Craters can vary greatly in size, ranging from small to very large. Conversely, a boulder is a large rock that usually has a diameter exceeding 25 cm. The emergence of boulders on the surfaces of planets, moons, and asteroids can be attributed to several factors, including impact events, volcanic activity, and erosion.

Deep-Learning Approach
Once the spacecraft's pose has been determined, the objective is to land the spacecraft safely by avoiding craters and boulders. To achieve this objective, several algorithms have been developed. Li et al. (2021) [8] suggest a novel approach to detect and classify planetary craters using deep learning. The approach involves three main steps: extracting candidate regions, detecting edges, and recognizing craters. The first stage involves extracting candidate regions that are likely to contain craters, which is done using a structured random forest algorithm. In the second stage, edge detection, the edges of craters are extracted from the candidate regions through the application of morphological techniques. Lastly, in the recognition stage, the extracted features are classified as craters or non-craters using a deep learning model based on the AlexNet architecture. Wang, Song et al. [9] propose a new architecture called "ERU-Net" (Effective Residual U-Net) for lunar crater recognition. ERU-Net is an improvement over the standard U-Net architecture, which is commonly used in image segmentation tasks. The ERU-Net architecture employs residual connections between its encoder and decoder blocks, along with an attention mechanism that helps the network prioritize significant features during training.

Supervised Learning Approach
Supervised detection approaches utilize machine learning techniques and a labeled training dataset in the relevant domain to create classifiers, such as neural networks (Li & Hsu, 2020) [79], support vector machines (Kang et al., 2019) [22], and the AdaBoost method (Xin et al., 2017) [23]. Kang et al. (2018) [22] presented a method for automatically detecting small-scale impact craters from charge-coupled device (CCD) images using a coarse-to-fine resolution approach. The proposed method involves two stages. Firstly, large-scale craters are extracted as samples from Chang'E-1 images with a spatial resolution of 120 m. Then, histogram of oriented gradients (HOG) features and a support vector machine (SVM) classifier are used to establish the criteria for distinguishing craters and non-craters. Finally, the established criteria are used to extract small-scale craters from higher-resolution Chang'E-2 CCD images with spatial resolutions of 1.4 m, 7 m, and 50 m. Apart from that, Xin et al. [23] propose an automated approach to identify fresh impact sites on the Martian surface using images captured by the High-Resolution Imaging Science Experiment (HiRISE) camera aboard the Mars Reconnaissance Orbiter. The proposed method comprises three primary stages: the pre-processing of the HiRISE images, the detection of potential impact sites, and the validation of the detected sites using the AdaBoost method. The potential impact sites are detected using a machine-learning-based approach that uses multiple features, such as intensity, texture, and shape information. The validation of the detected sites is done by comparing them with a database of known impact sites on Mars.
An automated approach for detecting small craters with diameters of less than 1 km on planetary surfaces using high-resolution images is presented by Urbach and Stepinski [24]. The three primary stages of the suggested technique are pre-processing, candidate selection, and crater recognition, with the pre-processing stage transforming the input image to enhance features and minimize noise. In the candidate selection step, a Gaussian filter and adaptive thresholding are used to detect potential crater candidates. In the crater recognition step, a shape-based method is employed to differentiate between craters and non-craters. The suggested technique is shown to work well for finding small craters on the Moon and Mars.

Unsupervised Learning Approach
The unsupervised detection approach utilizes image processing and target identification theories to identify craters by estimating their boundaries based on the circular or elliptical properties of the image [80]. The Hough transform and its improved algorithms (Emami et al., 2019) [17], the genetic algorithm (Hong et al., 2012) [18], the radial consistency approach (Earl et al., 2005) [19], and the template matching method (Cadogan, 2020; Lee et al., 2020) [20] are among the common techniques utilized for this method. The morphological image processing-based approach for identifying craters involves three primary steps: firstly, the morphological method is used to identify candidate regions, followed by the removal of noise to pinpoint potential crater areas; secondly, fast Fourier transform-based template matching is used to establish the association between candidate regions and templates; and finally, a probability analysis is utilized to identify the crater areas [21]. The advantage of the unsupervised approach is that it can train an accurate classifier without requiring the labeling of a sizable number of samples. This strategy can be used when an autonomous navigation system's processing power is constrained. Nevertheless, it struggles to recognize challenging terrain.
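The template matching step above can be sketched with a direct normalized cross-correlation; the FFT-based variant cited in [21] computes the same correlation surface, just much faster on full-size images. The image and template below are toy examples, not planetary data.

```python
# Minimal sketch of template matching by normalized cross-correlation:
# slide the template over the image and keep the best-scoring position.

def ncc(patch, template):
    """Normalized cross-correlation between two same-sized 2D lists."""
    flat_p = [v for row in patch for v in row]
    flat_t = [v for row in template for v in row]
    n = len(flat_p)
    mp, mt = sum(flat_p) / n, sum(flat_t) / n
    num = sum((p - mp) * (t - mt) for p, t in zip(flat_p, flat_t))
    dp = sum((p - mp) ** 2 for p in flat_p) ** 0.5
    dt = sum((t - mt) ** 2 for t in flat_t) ** 0.5
    return num / (dp * dt) if dp and dt else 0.0

def best_match(image, template):
    """Return the (row, col) where the template correlates best."""
    th, tw = len(template), len(template[0])
    scores = {}
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            scores[(r, c)] = ncc(patch, template)
    return max(scores, key=scores.get)

# A bright rim-like pattern embedded at (2, 3) in a dark image:
template = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
image = [[0] * 8 for _ in range(8)]
for i in range(3):
    for j in range(3):
        image[2 + i][3 + j] = template[i][j]
loc = best_match(image, template)
```

Because the correlation is normalized by each patch's mean and variance, the score is insensitive to uniform brightness changes, which matters under varying solar illumination.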

Combined Learning Approach
To detect craters, a combined detection methodology employs both unsupervised and supervised detection methods. One example is the KLT detector, a combined detection technique used to extract probable crater regions [25]. In this approach, a supervised detection methodology takes image blocks as inputs, and the detection accuracy is significantly influenced by the KLT detector's parameters. Li and Hsu's (2020) [26] study employed template matching and neural networks for crater identification. However, this approach is unable to significantly reduce the number of candidate craters in rocky areas, leading to weak crater recognition in mountainous regions. Li et al. [8] propose a three-stage approach for combined crater detection and recognition. In the first stage, a structured random forest algorithm extracts the crater edges. The second stage determines candidate areas through edge detection techniques based on morphological methods. The third stage recognizes candidate regions using the AlexNet deep learning model. Experimental results demonstrate that the proposed crater edge detection technique outperforms other edge detection methods, and the approach shows comparatively high detection accuracy and detection rate when compared to other crater detection approaches.

Finding the Altitude from an Image
Once the landing area is identified along with the pose of the vehicle, the altitude of the drone can be calculated, either with sensors or with computer vision techniques. Sensors such as an altimeter, LiDAR, or RADAR can be used to find the altitude of the drone. Alternatively, the altitude can be calculated with computer vision techniques [81]. To apply computer vision techniques, the height of the object in the image needs to be known, which can be determined using methods such as Structure from Motion (SFM) and photogrammetry.

Structure from Motion Methods
The method known as Structure from Motion (SFM) analyzes the motion of an object in a scene to estimate its 3D structure from 2D images [82]. To calculate the height, at least two images of the object captured from distinct viewpoints are necessary. By examining changes in object position between the two images, SFM can ascertain its relative location in 3D space, including height. This approach demands a significant amount of computational power and a solid grasp of computer vision and 3D geometry. Figure 4 gives a pictorial representation of the Structure from Motion model.
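The underlying two-view geometry can be illustrated with its simplest case: if the camera translates sideways by a known baseline, the depth of a scene point follows directly from its pixel shift (disparity) between the two images. This is only a sketch under pinhole-camera assumptions with illustrative numbers; full SFM additionally estimates the camera motion itself from the image correspondences.

```python
def depth_from_parallax(f_px, baseline_m, x_first_px, x_second_px):
    """Pinhole two-view depth: Z = f * B / d, where d is the disparity,
    i.e. the shift of the same scene point between the two images (pixels),
    f is the focal length in pixels, and B the camera translation in meters."""
    disparity = x_first_px - x_second_px
    if disparity <= 0:
        raise ValueError("point must shift between the two views")
    return f_px * baseline_m / disparity

# Illustrative numbers: a point imaged at x = 420 px in the first view and
# x = 380 px after the camera translates 1.0 m sideways, with f = 800 px.
z = depth_from_parallax(800, 1.0, 420, 380)  # 800 * 1.0 / 40 = 20.0 m
```

The same relation explains why SFM needs views from distinct positions: with zero baseline there is no disparity, and depth cannot be recovered.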

Photogrammetry
Photogrammetry refers to a technique of obtaining measurements by analyzing the appearance of an object in photographs and taking into consideration various camera parameters such as sensor size and focal length. By doing this, it is feasible to calculate the object's height from the photograph [83]. Several factors affect the accuracy of photogrammetry, including image quality, object location within the image, and camera parameters such as focal length, position, orientation, and lens distortion. The camera's focal length, the distance between the lens's optical center and the camera sensor when focused at infinity, varies with the camera type. For example, a normal-angle aerial mapping camera has a focal length of approximately 305 mm, a wide-angle lens around 152 mm, and a super-wide-angle lens about 88 mm, whereas a typical consumer or mobile phone camera has an equivalent focal length closer to 30 mm. Camera calibration methods are briefly reviewed by Roncella et al. in [84]. Figure 5 gives an elaborate view of image formation in a digital camera. Photogrammetry produces accurate 2D maps and elevation models of the environment, which can be useful for the navigation and localization of UAVs. There are four types of photogrammetry used for drones, which will be discussed next.

Nadir Photogrammetry
Nadir photogrammetry is a method of photogrammetry that involves capturing aerial images of an object or area from directly overhead, typically from a camera mounted on a drone or an aircraft [85]. The term "nadir" refers to the point on the ground directly beneath the camera lens.

Convergent Photogrammetry
Convergent photogrammetry is a method of photogrammetry that involves capturing multiple images of an object or scene from different positions and angles to create a 3D model. The term "convergent" denotes that the optical axes of the camera lenses employed to capture images meet at a single point, usually at the center of the object or scene under scrutiny [86]. Both nadir and convergent photogrammetry are commonly used in a variety of applications, including surveying, mapping, architecture, and industrial design. They have the ability to create precise and elaborate 3D models of objects and surroundings, providing valuable aid in visualization, analysis, and design applications [87].

Low Oblique Photogrammetry
Low oblique refers to a specific type of aerial photography or photogrammetry in which images of an object or area are taken from a slightly tilted or inclined aerial viewpoint [47]. Low oblique photography involves mounting the camera at an angle ranging from around 30 to 60 degrees away from the vertical axis. Compared to high oblique photography, in which the camera is tilted at a steeper angle, low oblique images do not include the horizon, but they do capture more of the surroundings than nadir images, providing additional context to the object or area being photographed. However, low oblique photography can also introduce more distortion and parallax errors compared to nadir photography, which may require additional processing or corrections to achieve accurate results.

High Oblique Photogrammetry
Similar to low oblique photography, high oblique images are captured by mounting the camera at an angle away from the vertical axis. However, in high oblique photography, the camera is typically positioned at an angle greater than 60 degrees [88]. Compared to low oblique photography, in which the camera is tilted at a shallower angle, high oblique photography captures the horizon and more of the background, providing broader context but a less detailed view of the object or area being photographed. High oblique photography can also introduce more distortion and parallax errors compared to nadir photography, which may require additional processing or corrections to achieve accurate results.
Once the size of the reference object present in the image is known, the altitude of the drone can be measured from it. To measure the altitude, some constraints must be satisfied. Firstly, the image must not be cropped, and the scene should be as flat as possible. While taking the picture, the lens plane must be parallel to the ground plane, which helps minimize distortion and ensures accurate elevation measurements. Secondly, the line to be measured must be parallel to one of the edges of the picture, either horizontal or vertical. The reference object must be centered in the middle of the picture to minimize lens distortion. Lastly, knowledge of the camera sensor type and its focal length is also necessary. Once all these conditions are fulfilled, the altitude of the drone can be determined using the following equation:
Alt = FL/RF (1)

Equation (1) is used to find the altitude of the drone, where Alt denotes the altitude of the drone, FL denotes the focal length of the camera, and RF denotes the representative fraction.
RF = Si/Sr (2)

Equation (2) defines the representative fraction of the image: the size of the object as imaged by the camera (Si) divided by its actual size in the real world (Sr). Figure 6 gives a pictorial overview of different types of photogrammetry, and Figure 7 gives a detailed view of how the flying height is calculated using a single image, where Sr is the size of the object, Si is the size of the object as imaged by the camera, and r and d are the angles of vision.
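Equations (1) and (2) can be combined into a short sketch. The helipad size and sensor measurement below are illustrative placeholders, not values from the experiments; only the 16.5 mm focal length is mentioned later in the text. Note that Si and Sr must be expressed in the same units so that RF is dimensionless.

```python
def representative_fraction(si_mm, sr_mm):
    """RF = Si / Sr (Equation (2)): size of the object as imaged by the
    camera divided by its real-world size, both in the same units."""
    return si_mm / sr_mm

def altitude_mm(focal_length_mm, rf):
    """Alt = FL / RF (Equation (1)): flying height in the unit of FL / RF."""
    return focal_length_mm / rf

# Illustrative numbers: a 15 m helipad marking (Sr = 15000 mm) that
# appears 5 mm wide on the sensor, with a 16.5 mm focal length.
rf = representative_fraction(5.0, 15000.0)   # 1/3000
alt_m = altitude_mm(16.5, rf) / 1000.0       # about 49.5 m
```

Because Alt scales linearly with 1/Si, a small measurement error in the imaged size translates directly into altitude error, which is why the constraints above (flat scene, centered object, parallel lens plane) matter.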

Preliminary Work and Results
There are various methods to extract helipad features; the most efficient and quickest are image-matching functions and image filters applied to extract the desired object. The preliminary work covers both techniques, applying nonlinear spatial filters as well as image-matching techniques. The image-matching techniques include algorithms such as Scale-Invariant Feature Transform (SIFT), Speeded-Up Robust Features (SURF), Oriented FAST and Rotated BRIEF (ORB), Accelerated-KAZE (AKAZE), and Binary Robust Invariant Scalable Keypoints (BRISK). These algorithms utilize key points for matching. Key points refer to informative and distinct features in images that are utilized in computer vision applications such as feature matching, object recognition, and image retrieval. They can also be referred to as feature points, interest points, or blobs. To establish correspondences between images in feature matching, key points are identified and described in both images. These key points are then matched according to their descriptors, and the resulting matches can be employed for various purposes, including image alignment, object tracking, and image stitching.
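The descriptor-matching step described above can be sketched in a few lines. The example below uses toy 8-bit binary descriptors as stand-ins for the 256-bit binary descriptors produced by ORB or BRISK, with brute-force Hamming matching and Lowe's ratio test to discard ambiguous correspondences; all descriptor values are illustrative.

```python
def hamming(d1, d2):
    """Number of differing bits between two binary descriptors (ints)."""
    return bin(d1 ^ d2).count("1")

def match_descriptors(query, train, ratio=0.75):
    """Brute-force matching with a ratio test: keep a match only if the
    best distance is clearly smaller than the second-best distance."""
    matches = []
    for qi, qd in enumerate(query):
        dists = sorted((hamming(qd, td), ti) for ti, td in enumerate(train))
        best, second = dists[0], dists[1]
        if best[0] < ratio * second[0]:
            matches.append((qi, best[1]))  # (query index, train index)
    return matches

# Toy 8-bit descriptors: query[0] matches train[1] exactly,
# while query[1] is equally far from two candidates and is rejected.
query = [0b10110010, 0b11111111]
train = [0b00000000, 0b10110010, 0b11110000]
matches = match_descriptors(query, train)  # [(0, 1)]
```

The ratio test is what keeps ambiguous key points (such as repetitive helipad paint stripes) from producing spurious matches.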
The experimental benchmark dataset used here contains 6000 images of size 575 × 575 pixels. The data are taken from the HelipadCat repository [89], which consists of aerial images of helipads based on the FAA's database of US airports. The dataset is designed to allow classification by the visual shape and features of the object. Figure 8 shows a sample set of data used for the extraction. The images depict helipads in various contexts, such as airports, hospitals, and military bases, made of different materials, in different shapes and sizes, and under different lighting and weather conditions. Each image is labeled with one or more tags that describe its content, such as location, helipad type, and material. The dataset was created to serve as a benchmark for researchers working on helipad detection and identification using computer vision. Additionally, it contains helipads situated on land (static areas), on ships (dynamic areas), and in places without markings of any kind (complex areas).

Using Standard Filters
Initially, the landing area is isolated from an image through simple image processing methods. The image is converted into a grayscale representation using a transformation function. Subsequently, a nonlinear spatial filter, specifically a 5 × 5 median filter, is applied to eliminate noise while retaining the sharpness of object edges within the image. Then, a threshold equal to 30% of the maximum intensity value is applied to obtain the landing area. To further refine the landing area, small and large objects around the intended landing area are removed using image segmentation and connected component labeling. Line detection is then performed to precisely identify the landing area. Finally, a bounding box is drawn around the landing area to differentiate it from other objects in the image. Figure 9 gives an overview of each process output.
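The first two steps of this pipeline (median filtering and thresholding at 30% of the maximum intensity) can be sketched in pure Python on a synthetic frame. The 10 × 10 image, the bright "helipad" patch, and the noise pixel are illustrative only; the actual pipeline operates on the grayscale dataset images.

```python
import statistics

def median_filter(img, k=5):
    """k x k median filter with edge replication; removes impulse noise
    while preserving object edges better than linear smoothing."""
    h, w, r = len(img), len(img[0]), k // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                      for dy in range(-r, r + 1) for dx in range(-r, r + 1)]
            out[y][x] = statistics.median(window)
    return out

def threshold(img, frac=0.3):
    """Binarize at frac * max intensity (30%, as in the pipeline above)."""
    m = max(max(row) for row in img)
    return [[1 if v >= frac * m else 0 for v in row] for row in img]

# Synthetic 10x10 frame: a bright 4x4 "helipad" patch plus one noise pixel.
img = [[0] * 10 for _ in range(10)]
for y in range(3, 7):
    for x in range(3, 7):
        img[y][x] = 200
img[0][9] = 255  # isolated salt noise, suppressed by the median filter

mask = threshold(median_filter(img))
```

After filtering, the isolated noise pixel falls below the threshold while the patch survives, which is exactly why a median filter is preferred here over a mean filter: a mean filter would smear both the noise and the helipad edges.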

Using Feature Matching Functions
SIFT, SURF, ORB, AKAZE, and BRISK are all feature detection and description algorithms commonly used in object detection and matching. To use these algorithms, a reference image and a target image are needed. Figure 10 shows such a pair, in which the reference image is a clean helipad image taken from vector stock, and the target image is aerial imagery captured by an unmanned aerial vehicle (UAV). SURF (Speeded-Up Robust Features) is a faster version of SIFT that uses a Hessian matrix approximation to identify key points and extract descriptors [90]. SURF is designed to be faster and more efficient than SIFT while maintaining similar accuracy and robustness. Figure 11 gives the matching result of SURF with the reference and the target image.
The patented Scale-Invariant Feature Transform (SIFT) algorithm, introduced in Lowe's work [91], is designed to identify and characterize local features in images that remain invariant to scaling and rotation. This is accomplished by detecting key points, determining their scale-space extrema, and extracting descriptors based on the gradient orientation of the pixels around them. SIFT is widely used in numerous applications, including object recognition, image registration, and 3D reconstruction. Figure 12 gives the matching result of SIFT for the reference and the target image. Rublee et al. [92] presented the ORB (Oriented FAST and Rotated BRIEF) algorithm, which combines the FAST (Features from Accelerated Segment Test) and BRIEF (Binary Robust Independent Elementary Features) algorithms. ORB, designed for real-time applications such as mobile robotics and Simultaneous Localization and Mapping (SLAM), outperforms SIFT and SURF in terms of speed and efficiency. Figure 13 gives the matching result of ORB for the reference and the target image of the helipad. AKAZE (Accelerated-KAZE) [93] is an improvement over KAZE (a variant of SIFT) that uses nonlinear diffusion to enhance the detection of local features. AKAZE is better suited for high-speed feature detection and description applications due to its superior speed and robustness compared to SIFT and SURF. Figure 14 gives the matching result of AKAZE for the reference and target image.
The BRISK (Binary Robust Invariant Scalable Keypoints) algorithm, described in Leutenegger et al.'s work [94], is a feature detection and description technique that merges the speed and efficiency of binary feature descriptors with the resilience of scale-invariant feature detectors. BRISK is intended to outperform SIFT and SURF in terms of speed and efficiency while preserving similar accuracy and robustness, and it particularly benefits real-time applications such as detection, identification, and tracking. Figure 15 gives the matching result of BRISK for both the reference and target image. The number of key point matches for each algorithm is as follows: SIFT-28, ORB-42, AKAZE-26, BRISK-33, and SURF-37. A bar chart comparing the matched key points of these algorithms is presented in Figure 16. Once the landing area has been identified, the altitude at which the UAV flies is determined by dividing the size of the reference object as imaged by its corresponding size in the physical world to obtain the representative fraction; the focal length of the camera, roughly 16.5 mm in this scenario, is then divided by this fraction to yield the UAV's flying height. Specifically, the helipad present in the target image serves as the reference object, and its standard size in the aviation industry is shown in Figure 17.

Conclusions
The development of an autonomous landing system for UAVs necessitates the incorporation of a reliable vision-based navigation system that can deliver sequences of environmental data. Despite the potential of autonomous landing, the noisy nature of vision information obtained from the navigation system can lead to erroneous control decisions, especially in environments characterized by static, dynamic, and complex features. This paper has presented a comprehensive review of the application of photogrammetry with feature extraction techniques in a vision-based system to enhance the accuracy of control decisions during autonomous landing. The technique involves extracting features from static imaging data to facilitate more accurate decision-making during autonomous landing. Photogrammetry on static imaging has limitations in terms of accuracy, depth perception, coverage, and lighting conditions. Inaccurate measurements can result from factors such as camera calibration errors, low-quality images, or limitations in measurement techniques. In terms of coverage, photogrammetric techniques can only extract information from what is visible to the camera, which can limit the amount of information available for decision-making during autonomous landing. Additionally, lighting conditions can impact photogrammetric measurements, with shadows, reflections, and glare leading to inaccurate results.
Moreover, the quality of the supplied image is a key factor in photogrammetric approaches. Poor photo quality, such as images that are poorly exposed, out of focus, or blurred by motion, can negatively impact the accuracy of photogrammetric measurements. To overcome these limitations, it is necessary to carefully consider the acquisition and processing of images to ensure they are accurate and reliable for photogrammetric measurements. Additionally, other techniques, such as LiDAR or radar, can be used in combination with photogrammetry to provide more accurate and comprehensive information for decision-making during autonomous landing. Furthermore, this work explores the challenges of landing on extra-terrestrial bodies and provides an in-depth analysis of pose estimation and object classification techniques. The work also encompasses the application of structure from motion and photogrammetry methods for determining altitude from images and provides a deeper understanding of various photogrammetry techniques. This study enhances our understanding of the use of vision-based navigation systems for autonomous landing and sheds light on how photogrammetry can enhance the precision of control decisions. Future research could explore the integration of other landing systems with photogrammetry to enhance the overall performance of the autonomous landing system.
Landing areas can be classified as Static, Dynamic, and Complex, with Static locations further subdivided into cooperative target-based approaches, and Dynamic locations classified into vehicle-based and ship-based locations. Xin et al. [27] discuss cooperative target-based autonomous landing, which is further classified into classical machine learning solutions such as Hough transformation, template matching, edge detection, line mapping, and sliding window approaches. The authors begin with basic clustering algorithms and progress through deep learning algorithms to address the static location.

Figure 5. Image Formation in Digital Cameras.

Figure 7. Calculating the altitude of the drone from a single image.

Figure 8. Different kinds of landing areas present in the dataset.

Figure 11. Matching of Reference and Target image using SURF.

Figure 12. Matching of Reference and Target image using SIFT.

Figure 13. Matching of Reference and Target image using ORB.

Figure 14. Matching of Reference and Target image using AKAZE.

Figure 15. Matching of Reference and Target image using BRISK.

Figure 16. Comparison of key points detected by different algorithms.

Figure 17. Standard size of a helipad used in the aviation industry.

Table 2. Previously available algorithms for Static Landing Areas.

Table 3. Previously available algorithms for Dynamic Landing Areas.

Table 4 presents the obtained resultant values.

Table 4. Results obtained.