ELCD: Efficient Lunar Crater Detection Based on Attention Mechanisms and Multiscale Feature Fusion Networks from Digital Elevation Models

Abstract: The detection and counting of lunar impact craters are crucial for the selection of detector landing sites and the estimation of the age of the Moon. However, traditional crater detection methods are based on machine learning and image processing technologies. These are inefficient for situations with different distributions, overlaps, and crater sizes, and most of them mainly focus on the accuracy of detection and ignore the efficiency. In this paper, we propose an efficient lunar crater detection (ELCD) algorithm based on a novel crater edge segmentation network (AFNet) to detect lunar craters from digital elevation model (DEM) data. First, in AFNet, a lightweight attention mechanism module is introduced to enhance the feature extraction capabilities of the network, and a new multiscale feature fusion module is designed by fusing different multi-level feature maps to reduce the information loss of the output map. Then, considering the imbalance in the classification and the distributions of the crater data, an efficient crater edge segmentation loss function (CESL) is designed to improve the network optimization performance. Lastly, the crater positions are obtained from the network output map by the crater edge extraction algorithm (CEA). The experiment was conducted on the PyTorch platform using two lunar crater catalogs to evaluate the ELCD. The experimental results show that ELCD has a superior detection accuracy and inference speed compared with other state-of-the-art crater detection algorithms. As with most crater detection models that use DEM data, some small craters may be considered to be noise that cannot be detected. The proposed algorithm can be used to improve the accuracy and speed of deep space probes in detecting candidate landing sites, and the discovery of new craters can increase the size of the original data set.


Introduction
Impact craters constitute an important property of the lunar surface. Impact craters provide significant information for lunar evolution [1,2]. For example, the distribution and number of craters are often used to estimate the relative age of the Moon [3][4][5], and craters also provide important landmark information to accurately guide spacecraft to land [6,7]. The discovery of impact craters on the lunar surface is very important for studying the Moon, for example, by using the manual analysis and comparative evaluation of craters' images with different features to identify the permanently shadowed lunar polar regions [8]. In the study of crater counting, some crater catalogs have been formed manually by planetary scientists, such as the crater catalog of the Moon (diameter 5∼20 km [9], diameter ≥ 20 km [10]). However, the manual discovery of craters is time-consuming and laborious, and because experts may disagree on the interpretation of image data, the manual marking of craters also faces consistency and repeatability challenges.
Several automatic crater detection algorithms have been proposed, and these can be roughly grouped into two categories. The first category comprises unsupervised algorithms, which use digital image processing technology to detect craters, and the second comprises supervised algorithms, which employ machine learning or deep learning to extract impact craters.
Unsupervised-based automatic crater feature extraction algorithms are mainly based on traditional image processing methods, including Hough transforms [11][12][13], template matching [14], edge detection, convex grouping [15], and other recognition techniques. For example, the performance of a Hough transform applied to large scale crater counting was evaluated [13] in terms of its ability to automatically detect craters down to sub-km sizes on high-resolution images of the Martian surface. The Canny edge detector is widely used in computer vision to locate sharp intensity changes and to find object boundaries in an image. The combined adaptive Canny algorithm, which uses histograms of images and multi-scale Gaussian filtering, was used in [16] to achieve a crater matching rate of better than 85%. However, for irregular, incomplete shapes and areas with a high degree of overlap, the detection accuracy of such methods is poor. Furthermore, Chen et al. [17] used terrain analysis and mathematical morphology methods to identify different types of impact craters, which fit the crater edge based on the Moon's digital elevation model (DEM) data. In contrast, the mathematical fitting method is more reliable than the Hough ring transform algorithm, but its computational complexity is higher for the identification of large, dense craters.
Automatic crater supervised-based technology has developed rapidly through machine learning and deep learning methods. Machine learning-based methods often involve building a classifier to recognize candidate craters, and common classifiers, such as the principal components analysis [18], decision trees technique [19], support vector machine, and other hybrid methods [20], are used to classify candidate craters. To improve the classification accuracy of small craters, Kang et al. [21] combined a histogram of oriented gradient features and the support vector machine classifier to extract small-scale impact craters from charge-coupled device images. Furthermore, based on the scale of training samples generated from the surface imagery and digital elevation models of the Moon, [22] proposed an active machine learning approach to automatically detect candidate craters by training a classifier with better performance. These methods are able to recognize craters or non-craters with a high classification accuracy. However, they need to extract features manually when training a classifier to detect craters. For large-scale and high-density crater detection, most of them have poor recognition accuracy and robustness. Some of them cannot count craters or locate the positions of craters.
Deep learning, especially when based on convolution neural networks (CNNs), has achieved great success in solving problems with image classification, image segmentation [23,24], and synthetic aperture radar (SAR) automatic object detection [25,26] in the remote sensing fields. The CNN is a key representative network structure in deep learning techniques. Such techniques are different from machine learning techniques, which are more efficient and portable without a set of human-designed features [27]. Impact crater detection based on deep learning is an important method in the vision-based navigation systems and is used to solve the task of pinpoint landing on the Moon. Some works [28,29] have used CNN feature extraction and standard image processing technologies to detect and match the observed craters, which were used as visual landmark measurements by the navigation filter. Moreover, image segmentation [23,30] and object detection methods based on CNNs, e.g., faster region-CNN (R-CNN) [31] and mask R-CNN [32], are used to solve crater detection problems. For example, Tewari et al. [32] utilized the mask R-CNN framework to detect craters from optical images, digital elevation maps, and slope maps by post-processing to eliminate duplicate craters and extract the craters' global locations. Moreover, to improve the detection accuracy of small-impact craters, [33] proposed an end-to-end high-resolution feature pyramid network framework, denoted as HRFPNet. HRFPNet uses a new backbone with a feature aggregation module to enhance the feature extraction capability of small craters from thermal infrared imaging on Mars. However, most object detection-based methods need to consider the generation of the number of candidate boxes. For highly overlapping and dense craters, the quality of the generation of a large number of duplicate bounding boxes may affect the recognition speed and accuracy of crater detection. 
Therefore, most object detection schemes display relatively poor performances and high levels of computational complexity in crater detection.
Crater detection is also solved as a semantic segmentation problem, in which the rims or edges of craters can be extracted by pixel-level classification, and the crater position and size can be obtained by a post-pipeline method. For example, a semantic segmentation method based on the fully convolutional neural network was proposed [34]. This method uses different feature maps with multi-scale receptive fields to detect multiscale impact craters from remotely sensed planetary images. Moreover, semantic segmentation models [35][36][37] based on U-net [38] have been presented to detect craters. Silburt et al. [35] proposed DeepMoon based on the U-net network structure to recognize lunar craters from DEM data. This method can successfully identify about 45% of newly discovered craters in its validation data. However, the U-net network structure loses large amounts of detailed information in the encoder of the network, which leads to poor crater image contour recovery in the decoder process. To improve the accuracy of crater detection, a new network structure, ERU-Net [36], introduced the deep residual network module to improve the crater feature extraction ability. This successfully achieved a recall rate of 81.2% and a precision rate of 75.4% in lunar crater recognition when trained on 30,000 DEM images. Furthermore, to explore craters on Mars, DeLatte et al. [37] employed segmentation convolutional neural networks based on U-net for automatic crater detection from Martian daytime infra-red images. This method identified 65-75% of craters in common with a human-annotated dataset, and [39] used the ResUNET [30] model to detect craters with the global maps and infra-red imagery for Mars. However, resources in the deep space environment are limited [40]; thus, automated crater detection methods require a balance between model computational complexity and identification efficiency. Most of the above methods ignore the computational complexity of the model.
The deep learning-based algorithms described above have different improved optimization approaches for different crater tasks. However, the majority of object detection schemes perform relatively poorly as they are constrained by their vanilla network architectures or semantic segmentation. By comparing the network complexity and recognition results, it can be seen that crater detection methods based on the semantic segmentation model are more efficient than the end-to-end object detection model. However, most semantic segmentation-based crater detection methods mainly focus on the accuracy of recognition and neglect the reasoning speed of the network. Moreover, due to crater images having different distributions, degrees of overlap, and sizes on the surface of the Moon, and because the crater data may be imbalanced, crater detection algorithms based on semantic segmentation networks may suffer from significant performance degradation. Therefore, achieving a fast and effective crater detection method with a high level of precision based on a semantic segmentation model represents a challenging scenario.
To address this issue, in this study, we establish an efficient lunar crater detection (ELCD) algorithm that addresses the requirements for accurate and fast crater detection. In the ELCD algorithm, first, the crater edge is segmented by the attention mechanisms and multiscale feature fusion network (AFNet). Then, the crater position and size are extracted by post-processing based on the crater extraction algorithm (CEA). In AFNet, a lightweight attention mechanism is used to improve the feature extraction ability of the network, and a new multiscale feature fusion (MFF) module is designed in the upsampling process of the network to reduce the loss of detail in the semantic segmentation results. In addition, we consider the imbalance in the classification and distributions of the crater data and design a new crater edge segmentation loss (CESL) function for network training. The proposed loss function improves the optimization ability and convergence speed of the network through adaptive balance weights.
The main contributions of the paper are as follows:
• We propose an efficient crater detection network based on a new semantic segmentation network architecture, AFNet, which uses the lightweight attention mechanism and multiscale feature fusion module to provide better and faster detection of lunar impact craters.
• To improve the optimization capability of the network, we present the crater edge segmentation loss function, which considers the imbalance of classification and distributions of crater data to calculate the loss value using the different degrees of imbalance in the data.
• The experiment is conducted on the PyTorch platform [41] with lunar DEM data to verify the effectiveness of the ELCD. The results show that the ELCD outperforms the state-of-the-art crater detection models in terms of its detection accuracy and inference speed.
The rest of this paper is organized as follows: Section 2 describes the proposed network architecture, the design of the crater edge segmentation loss function, the crater edge extraction algorithm, and the details of the experiment. Section 3 provides the experimental results, and Section 4 presents our discussion. Finally, in Section 5, we conclude our work.

Materials and Methods
The two-stage workflow of the lunar crater detection method using DEM data is shown in Figure 1. The workflow includes two parts: (i) crater edge prediction by the semantic segmentation network AFNet and (ii) crater edge extraction with the post-pipeline method. The details of the ELCD are as follows. The workflow input is the lunar crater DEM image. The DEM contains abundant 3D morphological and topographic characteristics, and it is insensitive to light [27]. The workflow output is the crater's positional information, such as its longitude, latitude, and radius, which is determined by the crater edge extraction algorithm. First, crater images with different sizes, degrees of overlap, and distributions are transferred to the crater edge segmentation network to undergo crater edge prediction. Then, the network prediction results are processed with a post-processing pipeline based on the match template method to obtain the location information and radius of the craters.
Figure 1. The digital elevation model (DEM) image is first processed by AFNet to recognize crater edges by pixel-level classification. Then, the prediction result for the crater images is processed by a post-processing pipeline based on the match template method to detect the location information and radius of the craters.

AFNet
To obtain efficient crater edge prediction results, we formally describe the crater edge detection network architecture, as shown in Figure 2. The AFNet includes three parts: the network encoder, feature fusion, and decoder. In Figure 2, the black line is the network encoder, the blue line denotes the process of feature fusion, and the orange line represents the network decoder process. The network input is the gray DEM image, which has a fixed size of 256 × 256 pixels, and the output is the pixel-level classification for the prediction result.
Figure 2. AFNet framework based on the improved VGG-16. The input is the DEM image transferred to the network encoder process (green trapezoid). First, the DEM image is processed with a 1/N downsampling rate with an attention mechanism module (pink circle) and five convolution blocks. Then, feature maps with different resolutions are saved and fused by the multiscale feature fusion module (blue line) with element-wise summation (green) and the data blending block (blue squares) through the decoder process (yellow squares) to get a more fine-grained output feature map. The final output result denotes the network prediction results with pixel-level classification.
In encoder processing, we use the visual geometry group-16 (VGG-16) network [42] as the backbone to extract the crater features. This allows us to obtain a bigger receptive field using fewer parameters compared with other network structures. The backbone network includes five feature extraction blocks, denoted as $L = \{L_1, L_2, \ldots, L_i\}$, where $i$ is the number of feature extraction blocks. At the end of each feature extraction block, we introduce the attention mechanism module to extract the important features of the crater. $L_1$ and $L_2$ each contain two convolution layers, an attention mechanism module, and a max-pooling layer. $L_3$, $L_4$, and $L_5$ each contain three convolution layers, an attention mechanism module, and a max-pooling layer. All convolution layers use a 3 × 3 convolution kernel.
In feature fusion, we designed a simple and efficient MFF module to obtain more fine-grained output feature maps in the network decoder. The four fusion modules $fusion_j$, $j \in \{1, 2, 3, 4\}$, are shown in Figure 2. The MFF first uses element-wise summation (green in Figure 2) to fuse a low-resolution feature map and a high-resolution feature map in each step of the upsampling process (decoder). Then, the obtained fusion feature map is blended and transferred to the decoder process as an input for the next step (blue squares in Figure 2).
In decoder processing, the bilinear interpolation operation is used to restore the size of the feature maps through four decoder blocks, $Decoder_k$, $k \in \{1, 2, 3, 4\}$ (yellow squares in Figure 2). Each decoder block applies 2× upsampling and fuses richer feature map information to restore the feature map to its original size.
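The 2× bilinear upsampling used in each decoder block can be illustrated with a minimal pure-Python sketch on a single-channel feature map. The half-pixel-centre sampling convention used here is an assumption, since the text does not state which convention AFNet uses, and the function name `upsample_bilinear_2x` is hypothetical.

```python
def upsample_bilinear_2x(fmap):
    """Upsample a 2D feature map (list of rows) by a factor of 2.

    Assumption: half-pixel-centre sampling, i.e. the source coordinate of
    output index d is (d + 0.5) / 2 - 0.5, clamped to the valid range.
    """
    h, w = len(fmap), len(fmap[0])

    def sample_points(n_in):
        points = []
        for d in range(2 * n_in):
            s = max((d + 0.5) / 2.0 - 0.5, 0.0)   # source coordinate
            i0 = min(int(s), n_in - 1)             # lower neighbour
            i1 = min(i0 + 1, n_in - 1)             # upper neighbour (clamped)
            points.append((i0, i1, s - i0))        # neighbours + blend weight
        return points

    rows, cols = sample_points(h), sample_points(w)
    out = []
    for y0, y1, fy in rows:
        out_row = []
        for x0, x1, fx in cols:
            top = fmap[y0][x0] * (1 - fx) + fmap[y0][x1] * fx
            bottom = fmap[y1][x0] * (1 - fx) + fmap[y1][x1] * fx
            out_row.append(top * (1 - fy) + bottom * fy)
        out.append(out_row)
    return out


if __name__ == "__main__":
    for row in upsample_bilinear_2x([[0.0, 1.0]]):
        print(row)
```

Each call doubles both spatial dimensions, so a map of size H × W becomes 2H × 2W before it is fused with the encoder features.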

Attention Mechanism Module
The original impact craters have different density distributions, sizes, and degrees of overlap in the different lunar regions. A description of the characteristics of crater DEM data used in network training is given in Figure 3. When the crater DEM images are processed by random clipping, they may have an incomplete shape. These crater characteristics bring performance challenges to the semantic segmentation network.

[Figure 3 panels: DEM image; ground truth]
In the encoder, to improve the feature extraction ability of the network, we introduce an attention mechanism through efficient channel attention (ECA) [43], which is attached to the end of each feature extraction block of the proposed network to enhance the extraction of important features. As a lightweight module, ECA offers a favorable trade-off between performance and complexity: it involves only a handful of parameters while bringing a clear performance gain. The ECA block is the attention mechanism shown in Figure 2 with a pink circle. In the ECA, a 1D convolution with a kernel size of 3 is used to achieve information exchange between channels. The details of the ECA block attached to the end of the five feature extraction blocks are given in Figure 4. The ECA module is placed after the rectified linear unit (ReLU) activation function in each feature extraction block. Figure 4a denotes the location of the ECA in the feature extraction blocks $\{L_1, L_2\}$, and Figure 4b shows the location of the ECA in the feature extraction blocks $\{L_3, L_4, L_5\}$ in the encoder process of the network. The ECA can combine the crater channel and spatial attention to enhance crater feature aggregation, which can enhance the extraction of salient crater features.
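To make the channel-attention step concrete, the following is a minimal pure-Python sketch of an ECA-style gate: global average pooling per channel, a 1D convolution of kernel size 3 across neighbouring channels, a sigmoid, and channel-wise rescaling. The kernel weights and the function name `eca` are illustrative assumptions; in a trained network the three weights are learned.

```python
import math


def eca(feature_map, kernel=(0.25, 0.5, 0.25)):
    """Apply an ECA-style gate to a C x H x W feature map (nested lists).

    `kernel` is a hypothetical 3-tap weight vector standing in for the
    learned 1D convolution weights.
    """
    channels = len(feature_map)
    # 1) Global average pooling: one scalar descriptor per channel.
    pooled = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_map
    ]
    # 2) 1D convolution (kernel size 3, zero padding) across channels.
    padded = [0.0] + pooled + [0.0]
    mixed = [
        sum(w * padded[c + k] for k, w in enumerate(kernel))
        for c in range(channels)
    ]
    # 3) Sigmoid gate, then rescale every channel by its gate value.
    gates = [1.0 / (1.0 + math.exp(-m)) for m in mixed]
    scaled = [
        [[g * v for v in row] for row in ch]
        for g, ch in zip(gates, feature_map)
    ]
    return scaled, gates


if __name__ == "__main__":
    fmap = [
        [[1.0, 1.0], [1.0, 1.0]],
        [[2.0, 2.0], [2.0, 2.0]],
        [[0.0, 0.0], [0.0, 0.0]],
    ]
    _, gates = eca(fmap)
    print([round(g, 3) for g in gates])
```

Because the gate depends only on three kernel weights, the number of added parameters per block is independent of the channel count, which is the property the text refers to as lightweight.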

Multiscale Feature Fusion Module
Visual features with a coarse spatial resolution can be obtained by the encoder process. During the network encoder process, shallow layers can learn some local features because of their small receptive field, and the deeper convolution layers can obtain more abstract features. With the deepening of the network, the receptive field of the network becomes larger, but because of the down-sampling operation, a great deal of detailed information may be lost. The purpose of the decoder process is to obtain a segmented prediction image with the same size as the input through the upsampling operation. Traditional segmentation networks use a simple upsampling module with skipped lateral connections to restore the feature map, which may cause the restored feature map to lack detailed features. To overcome the problem of poor image contour recovery in the decoder process, we designed a simple and efficient multiscale feature fusion module to fuse more low-layer features in each decoder block. The four multiscale feature fusion modules $fusion_j$, $j \in \{1, 2, 3, 4\}$, are shown in Figure 5. We first obtained feature maps of different resolutions from the network encoder process. Then, we fused two adjacent feature maps as the upsampling input for the MFF to obtain an output feature map with more fine-grained information. The MFF included two cases, direct fusion and indirect fusion, as shown in Figure 5. Figure 5a shows direct fusion for two feature maps of the same resolution: $fusion_1$ to $fusion_3$. Figure 5b denotes indirect fusion with two feature maps of different resolutions: $fusion_4$. In direct fusion, the low-resolution feature map, denoted as Lp, and the high-resolution feature map, denoted as Hp, have the same resolution, and they are fused directly by element-wise summation to obtain the fusion feature map. However, the sizes of the feature maps often differ, usually by a factor of two, after the encoder process.
Therefore, processing is done through the indirect fusion module. In the indirect fusion module, the low-resolution feature map is not the same as the high-resolution feature map. Lp is first processed with the size alignment module to obtain the same resolution as Hp. Then, the two maps are fused by element summation to obtain the fusion feature map. Finally, the fusion feature map is transferred to the blending block (blue squares in Figure 2) to obtain the final fusion output, the indirect fusion feature map denoted as IFp or the direct fusion feature map indicated as DFp, as the branch input for upsampling processing.
The size alignment module includes a 1 × 1 convolution kernel to reduce the dimensions and a 3 × 3 convolution kernel whose stride is set to 2 to adjust the map to the same size as Hp. The blending block contains two simple 3 × 3 convolution layers to blend the fusion results. The final fusion feature maps have richer low-layer features, which could help us to obtain high-quality output prediction results in the decoder process.
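The difference between direct and indirect fusion can be sketched on single-channel maps as follows. This is a simplified illustration, not the AFNet module itself: the 1 × 1 channel-reduction convolution and the blending block are omitted, the 3 × 3 stride-2 convolution uses one shared hypothetical weight, and it is assumed that the map passed through size alignment is exactly twice as large as its partner, since a stride of 2 halves the spatial size.

```python
def conv3x3_stride2(fmap, weight=1.0 / 9.0):
    """3 x 3 convolution with stride 2 and zero padding 1.

    All nine taps share one hypothetical `weight`; a trained layer would
    have nine learned weights per channel pair.
    """
    h, w = len(fmap), len(fmap[0])
    out = []
    for i in range(0, h, 2):
        row = []
        for j in range(0, w, 2):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:   # zero padding outside
                        acc += weight * fmap[y][x]
            row.append(acc)
        out.append(row)
    return out


def fuse(lp, hp):
    """Fuse two maps by element-wise summation.

    Direct fusion: sizes already match. Indirect fusion: `lp` is first
    passed through the size alignment step.
    """
    if len(lp) == 2 * len(hp) and len(lp[0]) == 2 * len(hp[0]):
        lp = conv3x3_stride2(lp)                    # indirect fusion
    if (len(lp), len(lp[0])) != (len(hp), len(hp[0])):
        raise ValueError("feature maps cannot be aligned")
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(lp, hp)]


if __name__ == "__main__":
    print(fuse([[1.0, 2.0]], [[3.0, 4.0]]))                    # direct
    print(fuse([[9.0] * 4 for _ in range(4)], [[1.0, 1.0]] * 2))  # indirect
```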

Crater Edge Segmentation Loss Function
In the crater prediction network, crater images can be divided into foreground images and background images by pixel-level segmentation. In a crater image, the foreground image is the segmented object (crater edge), and the background image represents everything but the object. However, most crater detection methods based on segmentation networks use traditional loss functions, such as the cross-entropy (CE) loss function [35][36][37], to train the network, and they cannot overcome the variation in size and the serious crater data imbalance problem, resulting in a performance decrease. The CE can be computed as
$$CE(p_i, y_i) = -\left[ y_i \log(p_i) + (1 - y_i) \log(1 - p_i) \right], \tag{1}$$
where $p_i$ is the prediction value of the network, $y_i$ is the ground truth, $p_i \in [0, 1]$, and $y_i \in \{0, 1\}$. However, in the cross-entropy loss, the weight of each sample is the same, so the CE loss is dominated by the majority class when the classes are imbalanced. Later, the focal loss (FL) function [44], which considers the classification imbalance in dense object detection, was proposed to improve the network performance. The FL is defined as
$$FL(p_i, y_i) = -\alpha\, y_i\, (1 - p_i)^{\gamma} \log(p_i) - (1 - \alpha)(1 - y_i)\, p_i^{\gamma} \log(1 - p_i), \tag{2}$$
where α is a weighting factor, α ∈ [0, 1] for class 1 and 1 − α for the other class; $(1 - p_i)^{\gamma}$ denotes the modulating factor; and γ denotes the tunable focusing parameter. The FL can balance the importance of positive and negative examples and differentiate between easy and hard examples by modulating the two factors α and γ. Inspired by the FL [44], we propose a novel crater edge segmentation loss function to optimize the proposed network. In the FL, only the classification imbalance of the data was considered when designing the loss function. In contrast, in this paper, two data imbalance factors were considered, namely the classification imbalance and the distribution imbalance of the crater data, and the modulating factor of the loss function was set adaptively. We first calculated the imbalance characteristics of the crater data, which are shown in Figure 6.
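The relationship between the two baseline losses can be checked numerically with a short sketch of the standard per-pixel binary cross-entropy and focal loss. The default values α = 0.25 and γ = 2 and the clamping constant `EPS` are illustrative assumptions and are not the adaptive settings of the CESL.

```python
import math

EPS = 1e-7  # assumed clamp to keep log() finite at p = 0 or p = 1


def cross_entropy(p, y):
    """Per-pixel binary cross-entropy for prediction p and label y."""
    p = min(max(p, EPS), 1.0 - EPS)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Per-pixel focal loss; alpha weights class 1, (1 - alpha) class 0."""
    p = min(max(p, EPS), 1.0 - EPS)
    positive = -alpha * (1.0 - p) ** gamma * math.log(p)
    negative = -(1.0 - alpha) * p ** gamma * math.log(1.0 - p)
    return y * positive + (1 - y) * negative


if __name__ == "__main__":
    # An easy positive (p = 0.9) is down-weighted far more than a hard one.
    for p in (0.9, 0.5, 0.1):
        print(p, round(cross_entropy(p, 1), 4), round(focal_loss(p, 1), 4))
```

With γ = 0 the modulating factor equals 1, so the focal loss reduces to an α-weighted cross-entropy; increasing γ shrinks the contribution of well-classified pixels.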
We used the data imbalance ratio (IR) to represent the crater data classification imbalance. This is the ratio between the numbers of majority class samples (background) and the minority class samples (object). The crater classification imbalance is shown in Figure 6a. Moreover, we counted the distribution imbalance ratio (DR) as the number of craters in each label's image, as shown in Figure 6b.
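The two statistics can be computed directly from the ground-truth data. The sketch below assumes binary masks stored as nested lists (1 for crater-edge pixels, 0 for background) and a known crater count per image; the function names and the choice to reject masks without any edge pixel are illustrative assumptions.

```python
def imbalance_ratio(mask):
    """IR: number of background pixels divided by number of edge pixels."""
    edge = sum(v for row in mask for v in row)
    total = sum(len(row) for row in mask)
    if edge == 0:
        raise ValueError("mask contains no crater-edge pixels")
    return (total - edge) / edge


def batch_statistics(masks, crater_counts):
    """Average IR and average crater count (DR) over one batch."""
    ir_b = sum(imbalance_ratio(m) for m in masks) / len(masks)
    dr_b = sum(crater_counts) / len(crater_counts)
    return ir_b, dr_b


if __name__ == "__main__":
    mask_a = [[0, 0, 0, 0], [0, 1, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    mask_b = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
    print(batch_statistics([mask_a, mask_b], [3, 5]))
```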
We set the parameters α and γ adaptively based on the DR and IR of the crater data for the CESL. In the proposed crater edge segmentation loss function, α is used to adjust the weights of different categories, and γ is employed to differentiate between easy and hard examples. In this work, our goal was to accurately detect lunar craters. Some crater images are easy to distinguish, while others are difficult to distinguish. In the easy examples, the distribution of craters is sparse, and the craters are complete with little overlap, making them easy to detect. These crater images are shown in Figure 3a,b. In the hard examples, the distribution of craters is dense with high overlap, and the shapes are incomplete. These crater images are shown in Figure 3b,c. α and γ were calculated based on the average values in each training batch, where $IR_b$ and $DR_b$ are the classification imbalance ratio and the distribution imbalance ratio in the b-th batch of network training.
Figure 6. Data distribution statistics of impact craters: (a) crater classification imbalance; (b) crater distribution imbalance. We randomly generated 30,000 crater training images to show the imbalanced distribution. In (a), the x-axis is the data imbalance ratio (IR) [45], and the y-axis denotes the frequency distribution of the IR. In (b), which shows the distribution imbalance ratio (DR) of the craters, the x-axis denotes the number of craters in each training image, and the y-axis represents the frequency distribution of the crater number in the DEM images.
In our crater data, we found that classification imbalance was common in the training data of each DEM image; the maximum IR was about 266 and the average IR about 26, as shown in Figure 6a. Among the crater training images, the densest image has 112 craters, and the average crater number is 20, as shown in Figure 6b. We defined three degrees of data imbalance based on the imbalance characteristics of craters, namely, low, median, and high classification imbalance. To balance the proportions of the data distribution, we calculated the ratio of the three imbalance degree cases; the cases are most balanced, with a ratio of about 3:2:1 in the crater training data, when the corresponding IR ranges are IR ≤ 20, 20 < IR ≤ 40, and IR > 40. The value of α was set by the degree of imbalance, where $IR_b$ was used to adjust the data imbalance with different weights. Moreover, in general, highly overlapping, dense data may have a bad effect on crater classification. Thus, we also considered the craters' sparse distribution characteristics to improve the crater classification accuracy by setting different values of γ. The craters' sparse distribution characteristics, DR, were represented by the crater number in the DEM images, and γ was set according to $DR_b$, the crater number in the DEM images of the batch.

Crater Extraction Algorithm
The crater image segmentation results were obtained by AFNet. The results included activated pixels corresponding to the locations of the crater rims. We were able to extract crater positions and sizes from the crater image segmentation results through the post-pipeline method with the crater extraction algorithm based on the template matching method. Most impact craters are circular on the lunar surface, so the craters are detected by the ring feature in the extraction algorithm. However, for overlapping craters, traditional methods (such as the Hough transform and the Canny edge detector) [46] cannot detect rings in the segmentation results efficiently. We used the more efficient match template algorithm in scikit-image [47] (an image processing library implemented in the Python programming language) to extract crater positions. This method was used in [36,37] for crater edge extraction.
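The idea of matching ring templates against the binarized rim map can be illustrated with a simplified stand-in. The sketch below does not call scikit-image and does not compute the normalized cross-correlation that its match template routine uses; instead it scores a candidate ring by the fraction of its pixels that are activated. The function names are hypothetical.

```python
import math


def ring_pixels(cx, cy, r):
    """Integer pixel coordinates on a circle of radius r centred at (cx, cy)."""
    steps = max(8, int(4 * math.pi * r))
    return {
        (round(cx + r * math.cos(2 * math.pi * k / steps)),
         round(cy + r * math.sin(2 * math.pi * k / steps)))
        for k in range(steps)
    }


def ring_score(binary, cx, cy, r):
    """Fraction of ring pixels that are activated in the binary map."""
    h, w = len(binary), len(binary[0])
    pts = ring_pixels(cx, cy, r)
    hits = sum(1 for x, y in pts if 0 <= y < h and 0 <= x < w and binary[y][x])
    return hits / len(pts)


def find_rings(binary, r_min, r_max, p_m):
    """Slide rings of every radius over the map; keep those scoring above p_m."""
    h, w = len(binary), len(binary[0])
    found = []
    for r in range(r_min, r_max + 1):
        for cy in range(h):
            for cx in range(w):
                score = ring_score(binary, cx, cy, r)
                if score > p_m:
                    found.append((cx, cy, r, score))
    return found


if __name__ == "__main__":
    image = [[0] * 21 for _ in range(21)]
    for x, y in ring_pixels(10, 10, 5):
        image[y][x] = 1
    print(find_rings(image, 4, 6, 0.9))
```

The exhaustive search over (x, y, r) is written for clarity; a correlation-based implementation evaluates all positions for one radius in a single pass.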
The proposed CEA receives the prediction map $I$ of the crater segmentation network and outputs the crater evaluation results. The crater extraction pipeline is as follows. First, the prediction result is filtered by the binary threshold β, described as
$$p_i = \begin{cases} 1, & p_i > \beta \\ 0, & \text{otherwise} \end{cases} \tag{6}$$
where $p_i$ is the pixel intensity: $p_i$ is set to 1 when it is greater than β; otherwise, it is set to 0. Then, the match template algorithm is applied to match the craters over a radius range with a maximum radius $r_{max}$ and minimum radius $r_{min}$. The match template threshold $P_m$ is used to choose the high-confidence targets. Lastly, an evaluation of whether each crater is correctly identified is carried out. With the CEA, we detected craters from the network prediction result with a minimum radius $r_{min}$ of 5 km and a maximum radius $r_{max}$ of 40 km. The algorithm iteratively slides generated rings through the target, and it calculates the match probability at each $(x, y, r)$ coordinate to eliminate false target results, where $(x, y)$ is the center of the generated ring and $r$ is its radius. Any $(x, y, r)$ ring with a match probability greater than $P_m$ is checked against the coordinate and radius constraints to get the correct crater, expressed as
$$\frac{(x_i - \tilde{x}_i)^2 + (y_i - \tilde{y}_i)^2}{\min(r_i, \tilde{r}_i)^2} < D_{x,y}, \tag{7}$$
$$\frac{|r_i - \tilde{r}_i|}{\min(r_i, \tilde{r}_i)} < D_r, \tag{8}$$
where $(x_i, y_i)$ is the position of the crater $c_i$ extracted from the prediction image $I$, $x_i$ and $y_i$ are the latitude and longitude in $I$, respectively, and $r_i$ is the radius of the crater $c_i$. For the ground-truth image $\tilde{I}$, $(\tilde{x}_i, \tilde{y}_i)$ is the position corresponding to the crater $c_i$, $\tilde{x}_i$ is the latitude of the crater, $\tilde{y}_i$ is the longitude of the crater, and $\tilde{r}_i$ is the radius of the crater. $D_{x,y}$ is the error threshold of the longitude and latitude, and $D_r$ is the radius error threshold. When a detected crater meets these limits, it is regarded as a correct crater; otherwise, it is considered a false crater.
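The thresholding and acceptance steps can be sketched as follows. The position criterion is assumed to be the squared centre offset normalised by the squared smaller radius, compared with $D_{x,y}$, and the radius criterion the relative radius difference compared with $D_r$. The numeric thresholds in the usage lines are placeholders, not values reported in the text, and the function names are hypothetical.

```python
def binarize(pred, beta):
    """Set every pixel above the threshold beta to 1 and all others to 0."""
    return [[1 if p > beta else 0 for p in row] for row in pred]


def is_correct(detected, truth, d_xy, d_r):
    """Check a detected (x, y, r) ring against one ground-truth crater."""
    x, y, r = detected
    xt, yt, rt = truth
    r_small = min(r, rt)
    position_error = ((x - xt) ** 2 + (y - yt) ** 2) / r_small ** 2
    radius_error = abs(r - rt) / r_small
    return position_error < d_xy and radius_error < d_r


def count_correct(detections, truths, d_xy, d_r):
    """Count detections that satisfy both constraints for some ground-truth crater."""
    return sum(
        1 for det in detections
        if any(is_correct(det, gt, d_xy, d_r) for gt in truths)
    )


if __name__ == "__main__":
    print(binarize([[0.2, 0.8], [0.5, 0.51]], beta=0.5))
    truths = [(10, 10, 5), (40, 40, 12)]
    detections = [(11, 10, 5), (30, 10, 5), (41, 39, 11)]
    # d_xy and d_r below are placeholder thresholds for illustration only.
    print(count_correct(detections, truths, d_xy=1.8, d_r=1.0))
```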
The pseudo-code of the efficient lunar crater detection ELCD algorithm includes crater edge prediction by the semantic segmentation network AFNet and the post-pipeline method with CEA, as described in Algorithm 1.
The input of the algorithm contains the test DEM data $Y$ with an image size of 256 × 256 pixels, the number of images processed per batch $|Z(k)|$, the crater classification number $N_{class}$, the trained network model $M$, and the ground truth of the crater images $\tilde{Y}$. The outputs are the position and size of the craters and the evaluation of the crater detection results. First, the batch data $Y(i)$ of crater images in the test set $Y$ are passed to the trained AFNet model $M$ to obtain the prediction results $pred_{dem}$ of the network. Then, the prediction feature map $pred_{dem}$ is binarized with the threshold β, and the match template threshold $P_m$ is used to filter out matching craters. The correctly identified craters are evaluated by the error constraints shown in Equations (7) and (8), and the results of the evaluation are counted using the statistical function Count(). Finally, the position and size of the craters, $Pos$, and the evaluation results, $Det$, of the correctly identified craters are obtained using the mean results for the test crater DEM data $Y$.
[Algorithm 1. ELCD pseudo-code; surviving fragment: initialization of Pos (the information about the crater's position and size) and Det = [].]

Experiments
In this section, we describe the experiments conducted to verify the performance of the proposed algorithm. The experiments involved the experimental setup, experimental datasets, evaluation metrics, and comparison algorithms. The details are given below.

Experimental Setup
The experiments were performed on a machine with a single NVIDIA GeForce RTX 3060 GPU, 64 GB RAM, and an 8-core CPU, using CUDA 11.0 and PyTorch 1.7.1. The CE-Adam [48] optimizer was used to optimize the network model, and the learning rate was set to $1 \times 10^{-4}$. The network was trained for 100 epochs with a batch size of 32. We conducted crater detection experiments on the lunar DEM datasets, where the input DEM image was 256 × 256 pixels in size. The crater edge semantic segmentation network AFNet and the crater edge extraction results were evaluated using the relevant evaluation criteria, as detailed in Section 2.6.3.
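As a rough illustration of what these settings imply, the sketch below collects them in a plain dictionary and counts optimizer steps. The key names and the helper `steps_per_epoch` are hypothetical, the training-set size of 30,000 images is the figure given in the Datasets section, and keeping the final partial batch is an assumption.

```python
import math

# Hyperparameters as reported in the experimental setup.
CONFIG = {
    "learning_rate": 1e-4,
    "epochs": 100,
    "batch_size": 32,
    "input_size": 256,
}


def steps_per_epoch(n_train, batch_size):
    """Number of batches per epoch, assuming the last partial batch is kept."""
    return math.ceil(n_train / batch_size)


steps = steps_per_epoch(30_000, CONFIG["batch_size"])
total_steps = steps * CONFIG["epochs"]
```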

Datasets
In our experiment, we used lunar DEM data from the Lunar Reconnaissance Orbiter (LRO) and the Kaguya merged digital elevation model. The resolution of the DEM was about 59 m/pixel [49], and it spanned 180° W to 180° E and 60° S to 60° N. The global DEM map was downsampled to 118 m/pixel with a size of 92,160 × 30,720 pixels. This map was used to randomly generate crater images that were 256 × 256 pixels in size.
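A minimal sketch of how random 256 × 256 windows could be drawn from the downsampled map is given below. It assumes that tiles are cropped at their final size and that the map is a simple cylindrical grid, so pixel coordinates convert linearly to longitude and latitude; neither detail is stated in the text, and the function names are hypothetical.

```python
import random

MAP_WIDTH, MAP_HEIGHT = 92_160, 30_720  # downsampled global DEM size in pixels
TILE = 256                              # tile edge length in pixels


def random_tile_box(rng):
    """Return (x0, y0, x1, y1) of a random tile that lies fully inside the map."""
    x0 = rng.randrange(MAP_WIDTH - TILE + 1)
    y0 = rng.randrange(MAP_HEIGHT - TILE + 1)
    return x0, y0, x0 + TILE, y0 + TILE


def pixel_to_lonlat(x, y):
    """Map pixel coordinates to degrees, assuming a linear (simple cylindrical) grid."""
    lon = -180.0 + 360.0 * x / MAP_WIDTH
    lat = 60.0 - 120.0 * y / MAP_HEIGHT
    return lon, lat


rng = random.Random(0)  # fixed seed so the sampled tiles are reproducible
box = random_tile_box(rng)
center = pixel_to_lonlat(MAP_WIDTH / 2, MAP_HEIGHT / 2)
corner = pixel_to_lonlat(0, 0)
```

Under these assumptions the map has 92,160 / 360 = 256 pixels per degree, so one tile spans one degree of longitude.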
Two lunar crater catalogs were used for the ground truth. The first catalog, termed Head [10], contains craters larger than 20 km in diameter. The other catalog was taken from Povilaitis [9] and contains craters with diameters of 5-20 km. We used the combined catalog, termed Head-LROC, to train our model in this paper. The total numbers of craters in the Head and Povilaitis catalogs were 5186 and 19,337, respectively. The different distributions and diameter sizes of craters based on the Head-LROC catalog are shown in Figure 7. We can see that around 51.5% of craters had a diameter of less than 10 km, which accounts for more than half of all data. Moreover, around 78.8% of craters had a radius of less than 20 km, representing about three-quarters of all crater data. Only 1.3% of craters had a radius of greater than 100 km. In the experiment, the original crater images and ground-truth images were generated from the global DEM map and the two lunar crater catalogs. The generated training, validation, and test sets contained 30,000, 3000, and 3000 DEM images, respectively. The training set was augmented by the random invert method: a random inversion θ was applied to the DEM image with a random probability p, p ∈ [0,1], where θ is defined as

Evaluation Criteria
In two-stage crater detection algorithms, the performance of the prediction network affects the final crater edge extraction result. With the other parameters fixed, the more clearly the crater edge is segmented, the better the crater edge extraction result is. Thus, we first evaluated the performance of the proposed crater edge segmentation network, AFNet. Four metrics from common semantic segmentation criteria [23,24] were used: the pixel accuracy (PA), mean pixel accuracy (MPA), mean intersection over union (MIoU), and frequency weighted intersection over union (FWIoU). Via an ablation study, we verify the validity of the proposed model and the improved crater edge segmentation loss function.
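The four segmentation metrics can all be derived from a per-class confusion matrix. The sketch below uses their standard definitions; the function name and the toy two-class matrix (background and crater edge) are illustrative assumptions, not values from the experiments.

```python
def segmentation_metrics(conf):
    """conf[i][j] = number of pixels of true class i predicted as class j."""
    n = len(conf)
    total = sum(sum(row) for row in conf)
    true_counts = [sum(conf[i]) for i in range(n)]                      # pixels per true class
    pred_counts = [sum(conf[i][j] for i in range(n)) for j in range(n)]  # pixels per predicted class
    hits = [conf[i][i] for i in range(n)]

    pa = sum(hits) / total
    class_acc = [hits[i] / true_counts[i] for i in range(n) if true_counts[i] > 0]
    mpa = sum(class_acc) / len(class_acc)

    ious, weighted = [], 0.0
    for i in range(n):
        union = true_counts[i] + pred_counts[i] - hits[i]
        if union > 0:
            iou = hits[i] / union
            ious.append(iou)
            weighted += (true_counts[i] / total) * iou  # weight by class frequency
    miou = sum(ious) / len(ious)
    return {"PA": pa, "MPA": mpa, "MIoU": miou, "FWIoU": weighted}


# Toy example: 920 background pixels and 80 edge pixels.
metrics = segmentation_metrics([[900, 20], [30, 50]])
```

In this toy case PA is high because background pixels dominate, while MIoU is much lower because the rare edge class is segmented less well. This is why the class-averaged metrics are reported alongside PA for imbalanced crater edge maps.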
After obtaining the crater image segmentation results, the crater positions and sizes can be obtained through the crater extraction algorithm. To evaluate the crater detection performance of the proposed ELCD algorithm, we used the precision (P), recall (R), and $F_\lambda$-score ($F_1$ or $F_2$), which are commonly used in machine learning, computed on a per-crater basis. The detection precision is the ratio of the number of matched craters $N_{match}$ to the number of detected craters $N_{detect}$. The recall is the ratio of $N_{match}$ to the number of human-annotated craters $N_{csv}$, and the $F_\lambda$-score is used to balance the precision and recall. For the $F_\lambda$-score, λ denotes the tuning parameter. When λ > 1, the recall is more important; when λ < 1, the precision is more important for the model's evaluation. The detailed calculation process is described in [35,36].
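A small sketch of these three measures follows. It assumes the standard weighted F-score, $F_\lambda = (1 + \lambda^2) P R / (\lambda^2 P + R)$; the exact procedure in [35,36] may differ in detail (for example, in how per-image values are averaged), and the function name and counts below are illustrative.

```python
def precision_recall_f(n_match, n_detect, n_csv, lam=1.0):
    """Precision, recall, and F_lambda from crater counts."""
    p = n_match / n_detect if n_detect else 0.0
    r = n_match / n_csv if n_csv else 0.0
    denom = lam ** 2 * p + r
    f = (1 + lam ** 2) * p * r / denom if denom else 0.0
    return p, r, f


# Illustrative counts: 100 detections, 90 annotated craters, 80 matches.
p, r, f1 = precision_recall_f(80, 100, 90, lam=1.0)
_, _, f2 = precision_recall_f(80, 100, 90, lam=2.0)
```

Here the recall exceeds the precision, so $F_2$, which weights recall more heavily, comes out higher than $F_1$.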
Many craters that truly exist were not marked in the ground truth; they were regarded as false negatives. In addition, in this paper, we used the combined lunar crater catalog Head-LROC [9,10]. The labels of the training dataset were incomplete in the crater catalog, and some newly discovered craters were identified through network prediction. We therefore calculated the discovery rate, that is, the false-positive rate for crater recognition, using two measures, $R^1_{new}$ and $R^2_{new}$. $R^1_{new}$ denotes the ratio between the newly discovered craters and all recognized craters, where TP denotes true positives and FP denotes false positives. $R^2_{new}$ shows the proportion of newly discovered craters to all impact craters, where FN indicates false negatives.
In the process of lunar crater recognition, the performance of the model was also evaluated by the accuracy of the positions and sizes of the recognized craters. We calculated the longitude error ($E_{lo}$), latitude error ($E_{la}$), and radius error ($E_r$) to evaluate the network model, using

$$E_{la} = \frac{\mathrm{abs}(la_p - la_t)}{2 \times (r_p + r_t)} \quad (13)$$

For the longitude error, $lo_p$ denotes the predicted longitude value, and $lo_t$ is the corresponding true longitude value of the crater. In Equation (13), $la_p$ is the latitude value of the predicted crater, and $la_t$ is the latitude value of the corresponding true crater. The radius error ($E_r$) was calculated from $r_p$, the radius of the predicted crater, and $r_t$, the corresponding true radius of the crater.

Compared Algorithms
The proposed ELCD algorithm was compared with four crater detection algorithms based on image segmentation technology: DeepMoon [35], ERU-Net [36], D-LinkNet [23], and SwiftNet [24]. The general procedure used for each algorithm was as follows:
• DeepMoon [35]: The basic idea of this algorithm is to train a deep network based on the U-Net architecture on lunar crater DEM data in order to discover lunar craters.
• ERU-Net [36]: To improve the detection accuracy of lunar craters, ERU-Net introduced the residual network module to the U-Net network architecture to enhance the crater feature extraction ability.
• D-LinkNet [23]: D-LinkNet is a highly efficient network that is often used for comparisons in crater detection. It is a semantic segmentation neural network that combines the encoder-decoder structure, dilated convolution, and a pre-trained encoder to carry out road extraction tasks.
• SwiftNet [24]: To verify the inference speed of the proposed model, we added SwiftNet to the compared network models. SwiftNet is a real-time semantic segmentation method based on residual network frameworks, which can achieve real-time detection for road-driving images.

Ablation Study
The ablation study on AFNet explored the influences of different network structures and loss functions on the crater recognition accuracy. The proposed modules and three loss functions (LFs), CE, FL, and the proposed loss function CESL, were compared in the ablation study. The comparison networks were initialized using either VGG-16 pre-training weights or normal initialization. In Table 1, a check mark denotes the use of a module, and VGG-16 denotes the basic network structure, which is included to give a better comparison. The results of the ablation study were obtained by evaluating PA, MPA, MIoU, and FWIoU on the crater validation data. The results are shown in Table 1, and the values in bold are the best values in each compared column.
We also show several feature maps of a crater image sample at decoder4 with the VGG-16, VGG-16-ECA, and AFNet network structures in Figure 8. We found that the output features in AFNet and VGG-16-ECA were clearly distinct from those in VGG-16. Some channel information was strengthened, while other channel information was weakened. AFNet and VGG-16-ECA include the attention mechanism ECA, which strengthens important features so that the edges of craters can be quickly distinguished from their backgrounds.
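The strengthening and weakening of feature maps can be illustrated with a pure-Python sketch of channel attention in the style usually described for ECA: global average pooling per channel, a small 1D convolution across neighboring channels, and a sigmoid gate. This is an assumed formulation rather than the AFNet implementation, the function names are hypothetical, and the kernel weights are made-up placeholders instead of trained values.

```python
import math


def eca_channel_weights(feature_maps, kernel):
    """Return one gate in (0, 1) per channel.

    feature_maps: list of C channels, each a 2D list (H x W).
    kernel: odd-length list of 1D convolution weights shared by all channels.
    """
    pooled = [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in feature_maps]
    half = len(kernel) // 2
    weights = []
    for c in range(len(pooled)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = c + k - half
            if 0 <= idx < len(pooled):  # zero padding at the channel borders
                acc += w * pooled[idx]
        weights.append(1.0 / (1.0 + math.exp(-acc)))  # sigmoid gate
    return weights


def apply_eca(feature_maps, kernel):
    """Rescale every channel by its attention weight."""
    weights = eca_channel_weights(feature_maps, kernel)
    return [[[v * w for v in row] for row in ch] for ch, w in zip(feature_maps, weights)]


maps = [[[2.0]], [[0.0]], [[-2.0]]]  # three toy 1x1 channels
weights = eca_channel_weights(maps, kernel=[0.0, 1.0, 0.0])
scaled = apply_eca(maps, kernel=[0.0, 1.0, 0.0])
```

Channels with a strong average response receive a gate close to 1 and are largely preserved, while weak channels are scaled down.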

The Evaluation Results for AFNet
In the iterative process of network training, the values of PA, MPA, MIoU, and FWIoU for AFNet on the validation set are shown in Figure 9. All evaluation metrics increased with the epoch, and the network began to converge at about epoch 35. The proposed model achieved a pixel accuracy of 96.8%, as shown in Figure 9a; the mean pixel accuracy was 82.8%, and the MIoU was 75.2%, as shown in Figure 9b. The FWIoU was 94.3%, as shown in Figure 9c. The training loss of AFNet is shown in Figure 9d. We can see that the initial loss value was very small under the VGG-16 pre-training weight initialization, and the network converged quickly, which allowed it to obtain the best performance. The network prediction results with AFNet are shown in Figure 10. The top figure denotes the ground truth of the DEM images, and the bottom figure shows the edge segmentation results. In lunar catalogs, some crater labeling is incomplete, with small and shallow craters missing, and some obvious craters are not labeled, which may affect the crater detection accuracy. However, AFNet recognizes the crater edges through the classification of each pixel. We can see that the proposed AFNet network was able to segment crater edges with different characteristics.

The Evaluation Results for the ELCD
We evaluated the performance of the ELCD, which is based on the edge segmentation network and the crater extraction algorithm, by detecting the crater radius, latitude, and longitude. Moreover, we computed the precision, recall, $F_1$, $F_2$, and the errors in the latitude, longitude, and radius of the crater for the match template method. In order to compare with other crater detection methods, we calculated the detection results for craters with a radius of 5-40 km. The error threshold of the longitude and latitude $D_{x,y}$ was set to 1.8, the radius error threshold $D_r$ was set to 0.1, and the binary threshold β was set to 0.1. We tuned the match threshold $P_m$ of the match template. For further details about the parameter setting process, refer to [35]. We evaluated the various metrics when the match template threshold $P_m$ ranged from 0.3 to 0.8 with an interval of 0.05. The average crater edge extraction results for the different match threshold values $P_m$ are shown in Table 2. The best value in each compared row is presented in bold, and the gray column indicates the best tuning parameters.
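The threshold sweep can be sketched as follows. The helper names are hypothetical and the scores are made-up placeholders that only demonstrate the selection step; they are not the values from Table 2.

```python
def threshold_grid(start=0.30, stop=0.80, step=0.05):
    """Evenly spaced thresholds, rounded to avoid floating-point drift."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, 2) for i in range(n_steps + 1)]


def best_threshold(scores):
    """scores maps each P_m value to a metric; return the P_m with the highest metric."""
    return max(scores, key=scores.get)


grid = threshold_grid()
toy_scores = {p_m: 1.0 - abs(p_m - 0.5) for p_m in grid}  # placeholder scores, not measured
best = best_threshold(toy_scores)
```

Building the grid from integer step counts rather than by repeated addition keeps values such as 0.35 and 0.8 exact to two decimals, which matters when they are used as dictionary keys.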
In Table 2, we can see that the values of precision, $F_1$, and $F_2$ increased as $P_m$ increased, while the values of recall and other metrics decreased as $P_m$ increased. A high precision rate of 92.1% was obtained when $P_m$ was 0.75, and the error values of $E_{lo}$, $E_{la}$, and $E_r$ were also minimal. When $P_m$ was set to 0.3, the value of recall was maximal and more new craters were obtained, at the cost of the maximum error values of $E_{lo}$, $E_{la}$, and $E_r$. New craters accounted for 41.9% and 70.2%, as shown by $R^1_{new}$ and $R^2_{new}$. $F_1$ balances the values of precision and recall. The best $F_1$ was 79.4% when $P_m$ was set to 0.5, where the precision was 80.6%, the recall was 81.9%, and the error values of $E_{lo}$, $E_{la}$, and $E_r$ were relatively small, at 12.0%, 9.8%, and 6.6%, respectively. $F_2$ pays more attention to the recall. When $P_m$ was 0.45, $F_2$ obtained the best value of 80.9%. In this paper, in accordance with [35,36], we used $F_1$ and $F_2$ to evaluate the ELCD algorithm.
The precision and recall curves of the ELCD algorithm are shown in Figure 11, where the upper green triangle represents the maximal point, and the yellow triangle denotes the minimal point. Figure 11a shows the precision and recall scores for the different match thresholds $P_m$. The two lines intersect at $P_m$ of about 0.5, which is a balance point between precision and recall. The precision-recall curve is shown in Figure 11b.

Figure 11. Precision/recall curve for the crater detection results. (a) P/R score with $P_m$; (b) P-R curve.

Comparison of Multiple Crater Detection Methods
In this section, we compare ELCD with different crater detection methods using the test set. As shown in Figure 11a, $P_m$ = 0.5 is a balance point between precision and recall, so we used the ELCD results obtained with $P_m$ = 0.5 for the comparison. We also measured the computational complexity of the different network architectures. In this paper, the billions of floating-point operations (FLOPs), the number of network parameters (Params), and the number of processed frames per second (FPS) were used to evaluate the computational complexity of the trained networks. In the FPS computation, in accordance with [24], we set the test batch size to 1.
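One common way to estimate FPS with a batch size of 1 is to time repeated single-image forward passes after a few warm-up runs. The sketch below shows this generic procedure, which is an assumption and not necessarily the protocol of [24]; `measure_fps` and `infer` are hypothetical names, and a trivial function stands in for the network.

```python
import time


def measure_fps(infer, sample, n_warmup=5, n_runs=50):
    """Average frames per second of infer(sample), one image per call."""
    for _ in range(n_warmup):  # warm-up runs are excluded from the timing
        infer(sample)
    start = time.perf_counter()
    for _ in range(n_runs):
        infer(sample)
    elapsed = time.perf_counter() - start
    return n_runs / elapsed if elapsed > 0 else float("inf")


# Stand-in workload; a real measurement would call the trained network here.
fps = measure_fps(lambda image: sum(image), list(range(10_000)))
```

When timing a GPU model, the device should be synchronized before reading the clock (for example, with `torch.cuda.synchronize()` in PyTorch); otherwise, asynchronous kernel launches make the measured time too short.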
The average crater extraction results under various crater detection algorithms are shown in Table 3. We can see that DeepMoon increased the recall and the proportion of newly discovered craters, and ERU-Net obtained a low detection error for the crater radius. SwiftNet and D-LinkNet had relatively poor detection accuracy levels, but they had the lowest FLOPs and the fewest network parameters. Because of the autonomous landing requirements of deep space probes in the deep space environment, a crater detection algorithm needs not only a high detection accuracy but also a fast detection speed. The SwiftNet and D-LinkNet network structures were designed for the real-time target detection of road-driving images. They have few parameters, low FLOPs, and a high FPS, which meets the needs of real-time detection. However, as SwiftNet and D-LinkNet are simple network structures, they are inefficient for complex crater detection problems, and they perform poorly in lunar crater detection compared with other networks such as DeepMoon, ERU-Net, and the proposed algorithm. DeepMoon and ERU-Net achieved good crater detection results compared with SwiftNet and D-LinkNet, but they require more computational resources, and their FPS is also lower. In Table 3, the proposed algorithm is shown to achieve better crater detection precision (P) and $F_1$ and $F_2$ scores than DeepMoon, SwiftNet, D-LinkNet, and ERU-Net, with minimal $E_{la}$ and $E_r$ errors. Moreover, ELCD has a faster inference speed than the other algorithms. The proposed model combines the encoder, feature fusion, and decoder processes to achieve good network parallelism, which speeds up network inference.
The proposed ELCD has lower FLOPs than the DeepMoon and ERU-Net methods, and the total FLOPs in ELCD were shown to be about 1.7 times and 4.1 times lower than the values of DeepMoon and ERU-Net, respectively. For the FPS measure, although the parameters of the ELCD were not lower than those of DeepMoon and ERU-Net, the total FPS of the ELCD was about 8 times and 17 times higher than the values of DeepMoon and ERU-Net, respectively. Thus, the proposed ELCD algorithm achieved the best crater detection results with relatively few parameters and a low network complexity. It can achieve a balance between crater detection precision and network computation efficiency.
A comparison of the results obtained with different crater detection methods is shown in Figure 12. Each row represents the detection results of all compared crater methods for the same type of crater data. Each column represents the performance of the same detection method on different types of craters with varying degrees of classification and distribution imbalance. IR is the classification imbalance ratio, and DR denotes the distribution imbalance ratio, which was computed from the number of craters in each image label. The details are presented in Section 2.4. The greater the DR is, the denser the crater images are, and, relatively speaking, the smaller the IR is. The original DEM image is shown in Figure 12a, the ground truth in Figure 12b, and the results of the compared algorithms in Figure 12c-g. The blue circles denote the correctly detected craters, the green circles denote the newly detected craters, and the red circles denote unrecognized craters. We can see that D-LinkNet and SwiftNet performed poorly for crater detection, especially for dense crater data. There are many craters marked with red circles in Figure 12f,g. DeepMoon and ERU-Net could detect most of the labeled craters, in contrast to D-LinkNet and SwiftNet, but they performed poorly for large craters. For example, for IR = 9.3, DR = 49 and IR = 7.5, DR = 49, DeepMoon could not detect the large crater that is represented by the red circle in Figure 12d,e. In the third column, we can see that the proposed model increased the accuracy of crater detection compared with the other models for craters of different densities and sizes, as shown in Figure 12c. Moreover, the proposed model was able to detect some new unlabeled craters. However, small craters with a high degree of overlap were difficult to identify with high precision from DEM data for all compared algorithms. The proposed model regarded such craters as noise and could not detect them well.

Figure 12. (d) The detection results obtained with DeepMoon based on U-Net [35]. (e) The recognition results obtained with the ERU-Net network [36]. (f) The detection results obtained with D-LinkNet with the ResNet-18 network [23]. (g) The detection results obtained with SwiftNet, designed in [24]. In the figure, the blue circles represent correctly recognized craters, the green circles denote new craters discovered by the compared methods, and the red circles indicate unrecognized craters.

Discussion
With the application of deep learning techniques, great progress has been made in automated impact crater detection. The proposed method builds an efficient crater edge prediction network with a lightweight attention mechanism module and a multiscale feature fusion module to recognize crater edges from digital elevation models. The experimental results show that the presented method achieves high precision and recall rates and a fast detection speed in lunar crater detection, mainly for the following reasons: (1) we used the digital elevation model as the crater data, which contains abundant 3D morphological and topographic characteristics and is insensitive to light; (2) the proposed crater edge segmentation network is an efficient model that improves the accuracy of crater detection. It uses a lightweight attention mechanism module to enhance the feature extraction capability of the network encoder and a multiscale feature fusion module that fuses multi-level feature maps of different resolutions to reduce information loss in the network encoder; and (3) considering the imbalance of classification and the different density distributions of craters, we proposed an efficient crater edge segmentation loss function to optimize the network performance.
In the experimental results, Table 1 shows that the multiscale feature fusion module can increase the crater detection accuracy and that the proposed crater loss function achieves the best crater edge segmentation results. Figure 8 shows that the attention mechanism module can strengthen some channel information about craters and weaken other channel information, which emphasizes important crater features and allows the edges of craters to be quickly distinguished from their backgrounds. Figure 9 shows that the CESL can improve the ability of the network to obtain optimal solutions and can speed up the convergence of the improved model. The final crater detection results show that the proposed model, which includes the attention mechanism module and the multiscale feature fusion module, can achieve more fine-grained segmentation for crater edges with different characteristics, as shown in Figure 10. In Table 3 and Figure 12, which compare the different crater detection methods, the proposed model is shown to achieve the best detection performance with minimal errors in $E_{la}$ and $E_r$. Compared with other real-time target detection methods, this method has a faster inference speed. Compared with the survey of the global lunar orbiter laser altimeter (LOLA) dataset of the Moon, the algorithm can detect the marked craters on the lunar surface more accurately and can detect some undiscovered craters. There are some false and ambiguous markers in the global LOLA dataset, and the proposed algorithm can correct false positives in the original data. Moreover, the newly discovered craters can increase the size of the original data set.
The discovery of impact craters is important for studying the evolution of the Moon. There are many small craters on the Moon's surface, and they influence the estimation of the Moon's age. However, the study still has some limitations with regard to small crater detection. Most crater digital elevation models have a lower resolution than optical images and other higher-resolution images. Craters that are too small appear as points in DEM images, and they are likely to be ignored or considered to be noise and thus cannot be detected successfully using a digital elevation model. The optical image has a high resolution, but it is sensitive to illumination. Thus, determining how to avoid the impact of light on impact craters in optical images, or how to fuse the optical image and the digital elevation model to improve the small crater detection accuracy, deserves further attention in the future.

Conclusions
In this paper, an efficient lunar crater detection algorithm (ELCD), based on the segmentation convolutional neural network AFNet, was proposed to improve the crater detection accuracy and speed. Based on the VGG-16 network architecture, a lightweight attention mechanism module was introduced to enhance the extraction of important crater features in the network encoder. The proposed model uses a new feature fusion method that fuses the multi-level feature maps obtained from the network encoder to reduce the information loss of the output map in the network decoder. Then, considering the classification and distribution imbalance of the crater data, the crater edge segmentation loss function was used to improve the optimization performance of the proposed model. Lastly, the crater positions were extracted by the crater edge extraction algorithm based on the match template method. The proposed model was applied to two crater catalogs and compared with four state-of-the-art crater detection algorithms. The results demonstrate that the ELCD achieved an inference speed of about 73 Hz and a precision of 80.6% for lunar crater detection in a DEM image with 256 × 256 pixels on a GeForce RTX 3060, and it obtained the best accuracy of 79.4% for $F_1$ and 80.6% for $F_2$ compared with the other crater detection models. Moreover, the ELCD can be used to discover new craters and expand the size of the original data set. It is hoped that this algorithm will further improve the accuracy of lunar age estimation and the positioning accuracy of spacecraft landing. For future work, the network structure should be further optimized so that the model can improve its real-time detection speed and achieve a high crater detection accuracy for impact craters of different sizes.

Acknowledgments: The authors thank R. Povilaitis and J. Head for providing the 5-20 km and >20 km datasets of lunar craters. In addition, the authors would like to thank the reviewers for their valuable comments and suggestions.

Conflicts of Interest:
The authors declare no conflicts of interest.

Abbreviations
The following abbreviations are used in this manuscript: