A Unified Brightness Temperature Features Analysis Framework for Mapping Mare Basalt Units Using Chang’e-2 Lunar Microwave Sounder (CELMS) Data

Abstract: The brightness temperature (T_B) features extracted from Chang’e-2 Lunar Microwave Sounder (CELMS) data represent the passive microwave thermal emission (MTE) from the lunar regolith at different depths. However, there have been few studies assessing the importance and contribution of each T_B feature for mapping mare basalt units. In this study, a unified framework of T_B features analysis is proposed through a case study of Mare Fecunditatis, which is a large basalt basin on the eastern nearside of the Moon. Firstly, T_B maps are generated from original CELMS data. Next, all T_B features are evaluated systematically using a range of analytical approaches. The Pearson coefficient is used to compute the correlation of features and basalt classes. Two distance metrics, normalized distance and J-S divergence, are selected to measure the discrimination of basalt units by each T_B feature. Their contributions to basalt classification are quantitatively evaluated by the ReliefF method and out-of-bag (OOB) importance index. Then, principal component analysis (PCA) is applied to reduce the dimension of T_B features and analyze the feature space. Finally, a new geological map of Mare Fecunditatis is generated using CELMS data based on a random forest (RF) classifier. The results will be of great significance in utilizing CELMS data more widely as an additional tool to study the geological structure of the lunar basalt basin.


Introduction
Accurate mapping of the mineral composition of the lunar surface is crucial for understanding the thermal evolution of the Moon and the formation of its mare and mascons [1,2]. However, the presence of nanophase iron (np-Fe^0) particles resulting from space weathering, together with the lunar soil cover, introduces significant uncertainties in the retrieval of the lunar surface mineral composition using multispectral remote sensing data [3].
To date, several studies of mare geological structure and chemical composition have been conducted, mainly based on FTA (FeO + TiO2 abundance) data, Moon Mineralogy Mapper (M3) data, UV/VIS (ultraviolet-visible spectroscopy) data, the crater size-frequency distribution (CSFD) technique and so on [4–11]. For instance, Karthi et al. mapped the Mare Orientale impact basin using Chandrayaan-1 M3 data and Lunar Reconnaissance Orbiter (LRO) Wide Angle Camera (WAC) images [4]. Kramer et al. mapped the basalt units of Mare Moscoviense and Mare Nectaris based on FTA data and Clementine UV-VIS data [5]. Thiessen et al. mapped Mare Imbrium basalt units using M3 data [6]. Hiesinger et al. studied the ages and stratigraphy of lunar mare basalts in Mare Frigoris and other nearside maria based on CSFD measurements [7]. Liu et al. studied the basalt and volcanism chronology of the Orientale Basin using CCD (charge-coupled device) imaging [8]. Bugiolacchi et al. used near-infrared (NIR) data to produce new geological maps of several lunar areas, such as Mare Imbrium and Tycho Crater, and to study the mineralogy of the Copernicus Crater [9,10]. These studies use visual and multispectral information (from which the mineral data are also derived), which only reflects the condition of the lunar surface regolith and limits the acquisition of deep-layer lunar structure information. Bugiolacchi et al. detected the deep lunar structure near the landing site of the Chang'e-4 satellite by lunar penetrating radar (LPR) on board the Yutu-2 rover [12–15] and acquired the subsurface geological structure on the farside of the Moon for the first time, but these studies were only conducted over a small area along the route of the Yutu-2 rover due to the payload's limitation.
The Chang'e Lunar Microwave Sounder (CELMS), a unique passive microwave remote sensing device, has provided a complementary tool for investigating the lunar geological structure using brightness temperature (T_B) data obtained from the passive microwave radiation of the surface and subsurface lunar regolith. The information contained in the brightness temperature is related to the physical and chemical properties of the detected material [16,17]. The potential of CELMS data in studying mare basin and crater structures has been demonstrated in many studies. Meng et al. discovered some brightness temperature anomalies in craters and basins based on CELMS data that had not been discovered in traditional optical remote sensing data [18–23]. This is because CELMS can distinguish the characteristics at different depths of the lunar regolith structure. The causes of some anomalies are still uncertain. Researchers have ruled out the influence of rock abundance (RA), FTA or topography as the origin of these anomalies and hypothesized that they may originate from some unknown material. However, these pathbreaking works mainly relied on visual interpretation. Most of the research areas were manually analyzed, which is time-consuming. Moreover, these studies lacked quantitative analysis, and few of them provided detailed geological maps, which limited the significance of these discoveries.
Machine learning models have demonstrated remarkable performance in the surface mapping of the Earth [24–26]. However, they have seldom been applied to mapping mare basalt units. The reason may be the complex feature characteristics and insufficient ground truth data of the lunar basalt units. CELMS provides multiple brightness temperature channels across several frequencies and acquisition times, which makes machine learning models well-suited for CELMS data because machine learning techniques are designed for multichannel-input tasks. Manual interpretation of multichannel remote sensing data is both challenging and inefficient for making comprehensive analyses, whereas machine learning techniques can effectively leverage information from all the feature channels to make informed decisions.
In addition, the specific contribution of each brightness temperature feature to mare basalt unit classification is still ambiguous. Hence, this study proposes a unified feature assessment framework for CELMS data, which evaluates the features from several perspectives, including feature redundancy, feature separability and degree of contribution to classification. This approach provides a more comprehensive and accurate assessment of the individual features, which can help identify the most informative and discriminative features for mare basalt unit classification.
In brief, the main contributions of this study include the following.

1. A unified framework for assessing CELMS brightness temperature features is proposed to quantitatively analyze the influence of each feature on mare basalt classification. To the best of our knowledge, this is the first systematic framework for assessing CELMS brightness temperature features for lunar basalt unit classification.

2. The effectiveness of dimension reduction for brightness temperature features is demonstrated by analyzing the vector projection and data distribution in the feature space.

3. A new geological map of Mare Fecunditatis is generated based on the random forest algorithm using CELMS data.
Overall, this study represents a valuable addition to the field of lunar geology, and the proposed framework and methodology can be applied to other regions for further investigation.
The rest of this paper is organized as follows. Section 2 introduces the data used in this study and the research area. The methodology is briefly explained in Section 3. Section 4 presents the experimental results. Section 5 discusses the findings of Section 4. Finally, a conclusion is drawn in Section 6.

Dataset
The Chang'e-2 satellite, the second lunar orbiter in China's Lunar Exploration Program, was launched on 1 October 2010. One of its payloads is a microwave radiometer, which receives multiband microwave radiation signals from the lunar surface and subsurface. Its operating frequencies are 3.0 GHz, 7.8 GHz, 19.35 GHz and 37.0 GHz, with bandwidths of 100 MHz, 200 MHz, 500 MHz and 500 MHz, respectively [16,17]. The corresponding penetration depths on the lunar surface are 1–2 m at 3.0 GHz, 38.5–75 cm at 7.8 GHz, 15.5–31 cm at 19.35 GHz and 8.1–16.2 cm at 37.0 GHz, respectively [23]. The original data used in this research are Level 2C products (for which system correction, geometric positioning and brightness temperature inversion have been completed), with acquisition times ranging from 15 October 2010 08:50:02 to 20 May 2011 12:51:50. The error margins of all the frequencies are less than 0.5 K. The original CELMS data can be obtained on request from the Lunar and Planetary Data Release System (https://moon.bao.ac.cn, accessed on 30 March 2022).

Study Area
Mare Fecunditatis, centered at 7.8° S, 53.7° E, is a low-latitude basin on the eastern nearside of the Moon, to the southeast of Mare Tranquillitatis. It is filled with upper Imbrian basalt material and covers an area of approximately 310,000 km^2. The largest span of Mare Fecunditatis is about 909 km. The basin of Mare Fecunditatis was excavated in the pre-Nectarian period, while the mare basalt infill occurred during the Imbrian period. The volcanic activity continued until the early Eratosthenian period [27–29]. Accordingly, research on the geological units of Mare Fecunditatis will be of great significance to understanding the history of volcanic activity and promoting lunar exploration.
In this paper, we used the original CELMS data acquired between 13:00 and 14:00 as the noon brightness temperature T_B^noon and the data acquired between 0:00 and 1:00 as the midnight brightness temperature T_B^midnight to generate brightness temperature maps. These time periods correspond to the highest and lowest temperatures on the lunar surface, respectively, and are therefore ideal for capturing the special microwave thermal emission (MTE) characteristics and thermophysical features of the basaltic units [23]. In addition, due to the limitation of the satellite orbit, only at these times can enough data points be obtained to support the study. The spatial resolution of CELMS data is 0.25° × 0.25°, about 7.5 × 7.5 km in Mare Fecunditatis. In our previous work, it was proposed that the difference between the noon and midnight brightness temperatures (written as dT_B) has great potential for observing mare units [18–23]:

$$dT_B = T_B^{noon} - T_B^{midnight}$$

where T_B^noon and T_B^midnight denote the noon and midnight brightness temperatures of the same frequency channel, respectively. The heatmaps of all brightness temperature features in Mare Fecunditatis are shown in Figure 2.

Kramer et al. divided Mare Fecunditatis into five units according to the FTA (FeO + TiO2 abundance) [30]: Ihtm (a late(?) Imbrian, mid- to high-Ti, high-Fe mare basalt), Chtr (high-Ti spread), Iltm (an Imbrian-age, low-Ti, mid-Fe mare basalt), Im (an early(?) Imbrian-age, very low-Ti, low-Fe mare basalt) and Cc (crater ejecta), as illustrated in Figure 3. Ihtm, Chtr, Iltm and Im can be considered as four phases of mare basalt. Cc is the ejecta of craters, such as the Langrenus Crater and the Taruntius Crater, lying over the basalt units. Unlike other mare basins, Mare Fecunditatis is affected by a large crater, Langrenus. The ejecta of Langrenus spreads over the eastern part of Mare Fecunditatis and covers the mare basalt. In other words, this part combines crater ejecta and mare basalt.
Following the classification proposed by Kramer et al. [30], we divided the basalt and ejecta units into five classes. As shown in Figure 4, the red, green, blue, cyan and yellow labels stand for the sample areas of Ihtm, Chtr, Iltm, Im and Cc, respectively. The black line represents the boundary of the geological units by Kramer et al. To map Mare Fecunditatis and its vicinity comprehensively in the following sections, we also labeled an individual class named "Non-Mare Basalt Units", which includes non-basalt components, such as highland and Copernican craters.

Overall Framework
The overall framework of this paper is illustrated in Figure 5. The procedure includes the following steps.

1. Preprocessing and brightness temperature feature extraction. The noon and midnight T_B maps of Mare Fecunditatis at the frequencies of 3.0 GHz, 7.8 GHz, 19.35 GHz and 37.0 GHz are generated from the original CELMS data, and dT_B maps are generated from the difference between T_B^noon and T_B^midnight.

2. Feature analysis. All brightness temperature features for distinguishing the five mare geological units are evaluated in three aspects: correlation analysis, distance metrics and contribution to classification. The Pearson coefficient is employed to assess feature redundancy and class linear separability. Normalized distance (ND) and Jensen-Shannon (J-S) divergence measure the Euclidean distance between two classes in the feature space and the histogram separation, respectively. ReliefF and the out-of-bag (OOB) importance represent the contribution to classification for machine learning in different terms.

3. Dimension reduction and classification. The random forest (RF) classifier is used to map Mare Fecunditatis. Principal component analysis (PCA) is conducted on the 12 original features to reduce the dimension and analyze the feature space. A new geological map of Mare Fecunditatis is generated from CELMS data, based on the supervised machine learning method.

T B and dT B Feature Extraction
The T B noon , T B midnight and dT B features with different frequencies are generated based on CELMS data (12 features in total), which were introduced and displayed in Section 2.2.
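The construction of the 12 features can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the function name `build_feature_stack`, the `(4, rows, cols)` input layout and the channel ordering are hypothetical choices, and the gridding of raw CELMS footprints into maps is taken as already done.

```python
import numpy as np

# CELMS channels named in the text, in an assumed (hypothetical) storage order.
FREQS_GHZ = (3.0, 7.8, 19.35, 37.0)


def build_feature_stack(tb_noon, tb_midnight):
    """Stack noon T_B, midnight T_B and dT_B maps into a (rows, cols, 12) array.

    tb_noon, tb_midnight: arrays of shape (4, rows, cols), one map per channel.
    """
    tb_noon = np.asarray(tb_noon, dtype=float)
    tb_midnight = np.asarray(tb_midnight, dtype=float)
    if tb_noon.shape != tb_midnight.shape or tb_noon.shape[0] != len(FREQS_GHZ):
        raise ValueError("expected two arrays of shape (4, rows, cols)")
    # dT_B = T_B(noon) - T_B(midnight), computed channel by channel.
    d_tb = tb_noon - tb_midnight
    stack = np.concatenate([tb_noon, tb_midnight, d_tb], axis=0)  # (12, rows, cols)
    return np.moveaxis(stack, 0, -1)  # (rows, cols, 12)
```

Reshaping the result with `features.reshape(-1, 12)` gives one row per map cell, which is the layout assumed by the later sketches.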

Pearson Coefficient
In this study, we use the Pearson coefficient to quantify both the correlation between features and the linear separability among classes.
The Pearson coefficient measures the degree of linear correlation between two variables, defined as their covariance divided by the product of their standard deviations [31]:

$$\rho_{X,Y} = \frac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$$

where Cov is the covariance between variables X and Y, and σ denotes the standard deviation. The Pearson coefficient ranges from −1 to 1; an absolute value close to 1 reflects high correlation, and a value close to 0 means the variables are uncorrelated. From a feature engineering perspective, a high correlation coefficient typically indicates feature redundancy, while a low correlation coefficient suggests independent features.
In terms of class-by-class correlation, a higher correlation coefficient indicates that the corresponding classes are more difficult to classify.
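As a minimal sketch of the correlation analysis, the coefficient can be computed directly from its definition and, for a whole feature table, with `numpy.corrcoef`. The function names and the `(n_samples, n_features)` layout are assumptions made for illustration.

```python
import numpy as np


def pearson(x, y):
    """Pearson coefficient of two 1-D samples: Cov(X, Y) / (sigma_X * sigma_Y)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean()
    yc = y - y.mean()
    return float((xc * yc).sum() / np.sqrt((xc ** 2).sum() * (yc ** 2).sum()))


def feature_correlation_matrix(features):
    """Feature-by-feature correlation for an (n_samples, n_features) array."""
    return np.corrcoef(np.asarray(features, dtype=float), rowvar=False)
```

Entries of the matrix near ±1 flag candidate redundant feature pairs; the same function applied to per-class mean feature vectors would give a class-by-class view.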

Distance Metrics
Two distance metrics, normalized distance [32] and J-S divergence [33], are considered to quantitatively evaluate the degree of separation among T B features.Typically, a higher degree of separation indicates better classification performance in machine learning.

Normalized Distance
Normalized distance is defined as:

$$ND = \frac{|\mu_1 - \mu_2|}{\sigma_1 + \sigma_2}$$

where µ_i and σ_i denote the mean and standard deviation of class i (i = 1, 2). Normalized distance measures the distance between the two class means after normalizing out the influence of the standard deviations, representing the separation of the centers of different classes.
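A short sketch of this metric for one feature and one class pair follows. It assumes the common form |µ1 − µ2| / (σ1 + σ2) with population standard deviations; the function name is hypothetical.

```python
import numpy as np


def normalized_distance(a, b):
    """Normalized distance between two 1-D class samples of a single feature.

    Assumes the form |mu_1 - mu_2| / (sigma_1 + sigma_2), with population
    standard deviations (NumPy's default, ddof=0).
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(abs(a.mean() - b.mean()) / (a.std() + b.std()))
```

For example, samples `[0, 2]` and `[4, 6]` have means 1 and 5 and standard deviations of 1 each, giving a normalized distance of 2.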

J-S Divergence
J-S divergence, an improved version of K-L divergence, indicates the separability of two histograms. K-L divergence, also called relative entropy, is defined as:

$$D_{KL}(p \,\|\, q) = \sum_x p(x) \ln \frac{p(x)}{q(x)} \tag{3}$$

where p and q represent the probability densities of the two classes. From Equation (3), it can be recognized that the order of p and q influences the result of the K-L divergence. J-S divergence solves this problem by using a symmetrical weighting operation:

$$D_{JS}(p \,\|\, q) = \frac{1}{2} D_{KL}\left(p \,\Big\|\, \frac{p+q}{2}\right) + \frac{1}{2} D_{KL}\left(q \,\Big\|\, \frac{p+q}{2}\right)$$

A larger J-S divergence value means a higher degree of separation between two histograms or probability density distributions, as well as better classification capability. It has been proved that the J-S divergence equals ln 2 (approximately 0.6931) when two histograms are completely separated [34].
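The two divergences can be sketched for discrete histograms as below. The function names, the shared-bin histogramming and the default of 50 bins are illustrative assumptions rather than the settings used in the study; natural logarithms are used so that fully separated histograms give ln 2.

```python
import numpy as np


def kl_divergence(p, q):
    """K-L divergence sum(p * ln(p / q)) for discrete distributions.

    Bins with p == 0 contribute nothing; q must be positive wherever p is.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))


def js_divergence(p, q):
    """J-S divergence of two histograms (normalized internally to sum to 1)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()
    q = q / q.sum()
    m = 0.5 * (p + q)  # the mixture is positive wherever p or q is positive
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def js_from_samples(a, b, bins=50):
    """J-S divergence of two 1-D samples, histogrammed on shared bin edges."""
    lo = min(np.min(a), np.min(b))
    hi = max(np.max(a), np.max(b))
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi))
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi))
    return js_divergence(ha, hb)
```

Because the mixture (p + q) / 2 is never zero where either histogram has mass, the J-S value stays finite even for disjoint histograms, unlike the plain K-L divergence.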

Contribution to Classification
To further understand the contribution and importance of brightness temperature features in the mare basalt classification process using machine learning methods, we employed two feature selection approaches, ReliefF and out-of-bag importance, to assess the contribution.

ReliefF
Relief (Relevant Features) is a filter-type feature selection algorithm, in which a weighting statistic is computed for each feature before the features are input into the classifier [35]. Suppose a training set D:

$$D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$$

where x_i and y_i represent a training sample and its label, and m denotes the number of training samples. For each x_i, the nearest sample x_{i,nh} in the same class as x_i is chosen and named the "near-hit", then the nearest sample x_{i,nm} in a different class is chosen and named the "near-miss", and the weight statistic of feature j is updated by:

$$\delta^j = \sum_i \left[ -\mathrm{diff}\left(x_i^j, x_{i,nh}^j\right)^2 + \mathrm{diff}\left(x_i^j, x_{i,nm}^j\right)^2 \right] \tag{6}$$

where x_a^j is the value of sample x_a on feature j. For continuous data, diff() is defined as:

$$\mathrm{diff}\left(x_a^j, x_b^j\right) = \left| x_a^j - x_b^j \right|$$

From Equation (6), it can be observed that if the distance between x_{i,nm} and x_i is larger than that between x_{i,nh} and x_i, the weight increases and the corresponding feature benefits the classification. Otherwise, the feature does not contribute much to the classification process.
The Relief method was designed for binary classification. For multiclass classification, its variant ReliefF was proposed. Suppose the samples are from $|\mathcal{Y}|$ classes. For a sample x_i in class k, ReliefF finds its near-hit x_{i,nh} in class k, then finds a near-miss in each class other than k, written as x_{i,l,nm} (l = 1, 2, ..., $|\mathcal{Y}|$; l ≠ k). The weight of ReliefF is updated as:

$$\delta^j = \sum_i \left[ -\mathrm{diff}\left(x_i^j, x_{i,nh}^j\right)^2 + \sum_{l \neq k} p_l \, \mathrm{diff}\left(x_i^j, x_{i,l,nm}^j\right)^2 \right]$$

where p_l is the proportion of samples in class l.
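The ReliefF update described above can be sketched as follows. This is a simplified illustration, not the implementation used in the study: it uses a single nearest neighbour per class, min-max scales each feature so that diff() is comparable across features, searches neighbours by Euclidean distance, and averages the accumulated weights over the samples. The function name is hypothetical.

```python
import numpy as np


def relieff_weights(X, y):
    """Simplified ReliefF feature weights (one near-hit, one near-miss per class).

    X: (n_samples, n_features) array; y: (n_samples,) class labels.
    Larger weights suggest features that separate the classes better.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, d = X.shape
    # Min-max scale each feature (an assumption of this sketch).
    span = X.max(axis=0) - X.min(axis=0)
    span[span == 0] = 1.0
    Xs = (X - X.min(axis=0)) / span
    classes, counts = np.unique(y, return_counts=True)
    priors = counts / n  # p_l: proportion of samples in class l
    w = np.zeros(d)
    for i in range(n):
        dist = np.sqrt(((Xs - Xs[i]) ** 2).sum(axis=1))
        dist[i] = np.inf  # never select the sample itself
        for c, p in zip(classes, priors):
            idx = np.flatnonzero(y == c)
            if np.all(np.isinf(dist[idx])):
                continue  # class c contains only sample i
            j = idx[np.argmin(dist[idx])]
            diff2 = (Xs[i] - Xs[j]) ** 2
            if c == y[i]:
                w -= diff2       # near-hit: penalize within-class spread
            else:
                w += p * diff2   # near-miss: reward between-class spread
    return w / n
```

On a toy set where the first feature equals the class label and the second is unrelated to it, the first weight comes out positive and the second negative, matching the interpretation given in the text.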

OOB Importance
The OOB error is used for assessing feature or variable importance in the Random Forest classifier [36,37]. It evaluates the change in classification performance when noise is added to a feature or variable. In a Random Forest classifier [38], not all data are input into each decision tree; instead, part of the data is randomly drawn for each tree, and the remainder is called the out-of-bag data. If adding stochastic noise to a feature results in a decrease in accuracy, the feature can be considered important for classification. The OOB importance of a decision tree is defined as the difference between the OOB errors with and without added stochastic noise:

$$I_{tree} = err_1 - err_2$$

where err_1 and err_2 are the out-of-bag errors with and without added noise. Suppose there are N trees in the Random Forest; the OOB importance of the Random Forest is:

$$I_{RF} = \frac{1}{N} \sum_{i=1}^{N} \left( err_{1,i} - err_{2,i} \right)$$

A higher OOB importance value of a feature means that noise has a greater impact on the classification process, indicating that this feature contributes more, and vice versa [39]. The details of the Random Forest classifier are introduced in Section 3.7.
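The OOB importance idea can be sketched with a generic bagging loop. Several choices here are assumptions made to keep the example short and dependency-free: the "noise" is a random permutation of one feature among the out-of-bag rows, and a nearest-centroid learner (`fit_centroids`, `predict_centroids`) stands in for the decision trees of a real random forest. All function names are hypothetical.

```python
import numpy as np


def oob_importance(X, y, fit, predict, n_trees=50, seed=0):
    """Mean increase in out-of-bag error when each feature is perturbed.

    fit(X, y) -> model and predict(model, X) -> labels define the base learner.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    rng = np.random.default_rng(seed)
    n, d = X.shape
    total = np.zeros(d)
    used = 0
    for _ in range(n_trees):
        boot = rng.integers(0, n, size=n)        # bootstrap rows (with replacement)
        oob = np.setdiff1d(np.arange(n), boot)   # rows this learner never saw
        if oob.size == 0:
            continue
        model = fit(X[boot], y[boot])
        err_clean = np.mean(predict(model, X[oob]) != y[oob])
        for j in range(d):
            Xp = X[oob].copy()
            Xp[:, j] = rng.permutation(Xp[:, j])  # perturb feature j only
            err_noisy = np.mean(predict(model, Xp) != y[oob])
            total[j] += err_noisy - err_clean
        used += 1
    return total / max(used, 1)


def fit_centroids(X, y):
    """Stand-in base learner: store one centroid per class."""
    classes = np.unique(y)
    return classes, np.stack([X[y == c].mean(axis=0) for c in classes])


def predict_centroids(model, X):
    """Assign each row to the class of its nearest centroid."""
    classes, centers = model
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d2, axis=1)]
```

Calling `oob_importance(X, y, fit_centroids, predict_centroids)` returns one value per feature; a feature whose perturbation leaves the out-of-bag error unchanged scores near zero.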

Principal Component Analysis
Principal component analysis (PCA) is a dimension reduction method [35,40] via linearly projecting original data into low-dimension space, whose goal is to find a new feature space with the largest variance so that highly related features are compressed and effective ones can be enhanced.
Suppose the data matrix X_{n×p} contains n samples of p variables, with each column mean-centered; then the covariance matrix Σ is:

$$\Sigma = \frac{1}{n-1} X^{T} X$$

The eigenvalue matrix is calculated by:

$$\Lambda = U^{T} \Sigma U \tag{12}$$

where U = [u_1, u_2, ..., u_p] contains the eigenvectors of Σ. Equation (12) can be written in detail as:

$$\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p), \quad \lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$$

where λ_1, λ_2, ..., λ_p are the eigenvalues of Σ, each corresponding to one eigenvector. The PCA data matrix is then defined as:

$$Y = X U_m, \quad U_m = [u_1, u_2, \ldots, u_m]$$

where 1 ≤ m ≤ p, and m is the number of top principal components (PCs) retained. In this step, the data are reduced to m dimensions [41]. The criterion for choosing m is:

$$\frac{\sum_{i=1}^{m} \lambda_i}{\sum_{i=1}^{p} \lambda_i} \ge t$$

Commonly, t is set between 80% and 95% according to the situation. PCA has been widely used in remote sensing image processing to extract effective information from multiple feature inputs [41–43]. CELMS brightness temperature data comprise different frequencies and acquisition times, forming multiple features, so PCA is introduced in this study to reduce the feature dimension.
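The PCA steps can be sketched with an eigendecomposition of the sample covariance matrix. The function name, the 1/(n − 1) normalization and the default threshold t = 0.95 are assumptions made for illustration.

```python
import numpy as np


def pca(X, t=0.95):
    """Project X onto the fewest leading PCs whose cumulative variance ratio >= t.

    X: (n_samples, n_features). Returns (scores, variance_ratio, components),
    where variance_ratio covers all PCs and components holds the kept ones.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each feature
    cov = Xc.T @ Xc / (X.shape[0] - 1)       # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh suits symmetric matrices
    order = np.argsort(eigvals)[::-1]        # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratio = eigvals / eigvals.sum()          # contribution rate of each PC
    m = int(np.searchsorted(np.cumsum(ratio), t)) + 1
    m = min(m, X.shape[1])                   # guard against rounding at t = 1
    return Xc @ eigvecs[:, :m], ratio, eigvecs[:, :m]
```

The returned `ratio` corresponds to the per-PC contribution rate and its cumulative sum to the cumulative contribution rate; the columns of `components` give the loadings that a biplot would draw as feature vectors.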

Random Forest
Random Forest (RF) is an ensemble learning method that integrates several decision trees to derive the classification result [38,44]. Each decision tree in a random forest is built independently of the others. Once the forest is established, each tree judges an input sample separately and assigns it to a class. Then, majority voting is used to combine all the outputs and predict the class of the input sample. The illustration of RF is shown in Figure 6. Owing to its ensemble learning approach, the random forest classifier has been reported to outperform other classifiers in remote sensing image classification [33,45–47]. Consequently, we have chosen RF as the supervised classifier to automatically map Mare Fecunditatis in this study.
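The majority-voting step can be illustrated in isolation. This sketch assumes the per-tree predictions are already available as integer class indices; the function name and the tie-breaking rule (lowest class index wins) are illustrative choices, not taken from the study.

```python
import numpy as np


def majority_vote(tree_predictions, n_classes):
    """Combine per-tree class predictions by majority voting.

    tree_predictions: integer array of shape (n_trees, n_samples) with
    labels in 0..n_classes-1. Ties go to the lowest class index.
    """
    preds = np.asarray(tree_predictions, dtype=int)
    # votes[c, s] = number of trees that assigned sample s to class c
    votes = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
    return votes.argmax(axis=0)
```

For instance, three trees predicting `[0, 1, 2]`, `[0, 1, 1]` and `[1, 1, 2]` for three samples yield the combined labels `[0, 1, 2]`.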

Statistical Analysis
To understand the distribution of brightness temperatures for each basalt unit in different channels, the normalized histograms of each feature, also called the probability density functions (PDFs), are shown in Figure 7. In Figure 8, boxplots of the study samples for all features are presented to further visualize and compare the differences in data distribution. In general, the class groups {Ihtm, Chtr} and {Iltm, Im, Cc} can be easily distinguished in the T_B^noon and dT_B features. In the T_B^midnight features, the classes largely overlap, although Iltm, Im and Cc can be separated to some degree. Iltm, Im and Cc may be confused in the T_B^noon and dT_B features, and Ihtm and Chtr are confused in the T_B^midnight features.

Pearson Coefficient
We use the Pearson coefficient to calculate the feature correlation and class correlation.
From the point of view of machine learning, a high correlation between two features (a coefficient close to 1 or −1) indicates redundancy, while a coefficient around zero suggests independence. To study the extent of feature redundancy, the feature-by-feature correlation is calculated, and the corresponding heatmap is shown in Figure 9, which reveals the relation among all the brightness temperature features. The class-by-class correlation coefficient is also analyzed and shown in Figure 10. We found that an extremely high correlation exists among Iltm, Im and Cc, indicating that distinguishing these three units in CELMS data would be challenging. In addition, the correlation decreases in the order Ihtm → Chtr → Iltm → Im → Cc, which is highly consistent with the phases of mare basalt eruption and formation.

Distance Metrics
Table 1 presents the normalized distance results calculated class by class. In general, T_B^noon features tend to be appropriate for classifying the majority of basalt units, except for Iltm, Im and Cc. On the other hand, T_B^midnight features are effective in distinguishing these three units (except the high-frequency T_B^midnight features for Im-Iltm). dT_B features are generally not suitable for distinguishing among Iltm, Im and Cc or between Ihtm and Chtr, while they are quite effective for other class pairs. In particular, dT_B at 37.0 GHz is suitable for distinguishing Cc-Iltm and Im-Iltm, whereas dT_B at 3.0 GHz performs poorly for Iltm-Cc and Cc-Im. Moreover, for all features, the normalized distance values between Ihtm and Chtr are smaller than those between other classes, indicating that these two units are difficult to distinguish.
In Table 2, the J-S divergence results are presented. Unlike normalized distance, J-S divergence measures the separability of the histograms. Note that a J-S divergence of 0.6931 means the histograms are completely separated [34], as introduced in Section 3.4.2. For most classes, the J-S divergence values of the T_B^noon features are higher than those of the T_B^midnight features, which further demonstrates the effectiveness of T_B^noon. The T_B^midnight features separate Cc-Im and Cc-Iltm significantly better than the T_B^noon and dT_B features. dT_B features are well suited to distinguishing most of the basalt classes, except Cc, Im and Iltm. In addition, high-frequency dT_B features are suitable for classifying Cc-Iltm and Im-Iltm.

Contribution to Classification
In this section, two feature selection methods, ReliefF and OOB importance, are applied to quantitatively evaluate the features' contribution to classification. Their principles are quite different: ReliefF is a filter selection method, measuring the contribution based on the average Euclidean distance, while OOB importance is a wrapper selection method, indirectly assessing the feature importance by adding stochastic perturbation.
All brightness temperature features are assessed in two ways, namely, class-by-class and all-classes calculations. Tables 3 and 4 report the class-by-class results. Interestingly, the class-by-class OOB importance values of the T_B^midnight features are equal to 0 in most cases. Nonetheless, the all-classes OOB importance values (shown in Figure 11) of the T_B^midnight features are higher than those of the T_B^noon features. This phenomenon can be attributed to the superior ability of the T_B^midnight features to distinguish Iltm, Im and Cc. In other words, unlike distance metric-based methods such as ReliefF, OOB importance takes the "unique contribution" (a contribution that cannot be provided by any other feature) into consideration to a larger degree. For the class pairs Ihtm-Iltm, Ihtm-Cc, Chtr-Cc and Chtr-Iltm assessed by OOB importance, the T_B^noon features perform better than the T_B^midnight features, indicating that T_B^noon features have advantages over T_B^midnight features in classifying these classes, while T_B^midnight features are more suitable for distinguishing Cc-Im and Cc-Iltm. In particular, dT_B at 37.0 GHz performs best for distinguishing Cc-Iltm, and dT_B at 37.0 GHz contributes little to the classification of Cc-Im.
In summary, taking Sections 4.3 and 4.4 together, we conclude that T_B^noon and dT_B features are suitable for classifying Ihtm-Iltm, Ihtm-Cc, Ihtm-Im, Chtr-Cc, Chtr-Iltm and Chtr-Im, whereas distinguishing Iltm, Im and Cc mostly depends on T_B^midnight features, and dT_B features are unsuitable for distinguishing Cc-Im, especially at 3.0 GHz. In addition, the Ihtm and Chtr units have similar components and structures, so they are not easy to separate in most features. However, it should be noted that the 3.0 GHz T_B^noon feature has a significant and unique advantage in distinguishing them.

Dimension Reduction and Classification
To map the basalt phase distribution in Mare Fecunditatis, the Random Forest classifier, which has been widely used for remote sensing image classification [33,45–47], is applied to the above-mentioned Chang'e-2 brightness temperature features. The only hyper-parameter tuned here is the number of trees, which is set to 125 empirically. The training set is the same as the study sample depicted in Figure 4.
We input all original brightness temperature features into a Random Forest classifier, and the result is presented in Figure 12, which is highly consistent with the geological map presented by Kramer et al. [30] shown in Figure 3. The boundaries of the geological units by Kramer et al. are marked by black lines. In our result, Ihtm, Chtr, Iltm, Im and Cc are colored red, green, blue, cyan and yellow, respectively. The Ihtm, Chtr, Iltm and Cc units mapped by us and by Kramer et al. are quite similar, while the Im unit differs to some extent. The other difference is that in our mapping result, the Taruntius Crater is filled with Iltm basalt, whereas this crater is not discussed in the map by Kramer et al. [30]. Generally, the basalt phases in Mare Fecunditatis exhibit a ring-shaped distribution. Ihtm is concentrated in the center, and Chtr and Iltm form the outer rings. Im is distributed in the south of Mare Fecunditatis, while Cc surrounds the Langrenus Crater and the Taruntius Crater. This is consistent with the volcanic eruption periods and the igneous process [30].

In the above sections, it was demonstrated that both redundant and irrelevant information exists in T B and dT B features.To make full use of Chang'e-2 CELMS data, reduce computational complexity and avoid the "curse of dimensionality", PCA, a classical dimension-reduction method, is considered before applying the RF classifier.We conducted the PCA algorithm on all 12 brightness temperature features.The data after PCA still have 12 dimensions.In Figure 13, the contribution rate of PCs, which is the variance of each PC (equivalent to the eigenvalue of the covariance matrix) that has been introduced in Equation ( 16), is shown by the blue bar.The cumulative contribution rate is shown in the orange line.The top 2 PCs hold a cumulative contribution of 99.08%, which demonstrates the strong redundancy that exists in the original features and explains the necessity of applying PCA.After the 3rd PC, the contribution rates drop, indicating the 3rd to 12th PCs hold so little effective information that their contribution is weak.Note that the variance of the 5th to 12th PCs is so low that they can be ignored, so we do not display them there.To better interpret the results of PCA, we use biplots to express the PCA feature space, which consists of data scatter points and vector projection.The biplots of the top three PCs are shown in Figure 14.The 1st-2nd PCs plane (Figure 14a) is first visualized, where the main contribution is concentrated.The gray line indicates the feature vector projection in the PCA space, and the red, green, blue, cyan and yellow points stand for the data distribution of Ihtm, Chtr, Iltm, Im and Cc, respectively.The orientation and length of the vectors stand for the degree of relation between the original features and PCs.The projection length of the feature vectors on PCs is called the loading coefficient.Generally, the T B noon and T B midnight features point to the positive direction of the 1st PC and the negative direction of the 2nd PC, 
respectively.Hence, the T B noon features are propitious to classifying {Ihtm, Chtr} vs. {Iltm, Im, Cc} groups, while T B midnight features are fit to distinguish Im and Cc.Especially, 37.0 GHz T B noon greatly impacts the first PC, but hardly influences the second PC, and 37.0 GHz T B midnight impacts the second PC greatly, but hardly affects the first PC, demonstrating their strong ability on classification.These conclusions mutually verify the findings drawn in the previous sections.
The angle between vectors reflects the correlation. Features among dT B, T B noon and T B midnight are strongly correlated. dT B and 19.35 GHz T B noon, as well as dT B and 37.0 GHz T B noon, are also highly correlated. Moreover, the lower the frequency, the stronger the correlation. These conclusions all agree with the analysis by the Pearson coefficient and serve as a mutual validation.

Ihtm and Chtr have similar components and basalt phases, so some of their samples are confused, but most of them can be distinguished in the 1st and 2nd PCs. In Figure 14b, the second PC has a relatively poor capability for classification. It can only effectively separate Cc vs. {Ihtm, Chtr} and Iltm vs. {Ihtm, Chtr}. Im can be easily separated from the other classes in the 3rd PC. In Figure 14c, it is obvious that the 3rd PC contributes quite weakly compared with the 1st PC: the distribution interval of the data in the 3rd PC is quite limited, and the feature vectors are concentrated along the direction of the 1st PC.

We classify the basalt units by jointly using PCA and RF. The top 2, 3, 4, 6, 8 and 12 PCs are selected for training the RF, and the classification results are shown in Figure 15. In the classification result of the top two PCs, the Im unit is poorly mapped, while the top three or four PCs produce good results, as the Im unit can be effectively separated from the other units, which verifies the previous analysis. However, the results for the Taruntius Crater and its vicinity differ considerably. In the classification using the top two PCs, the Taruntius Crater and its ejecta are classified as Iltm and Cc units, which is consistent with the mapping result obtained from the original features. In the other results, this area is misclassified or incompletely mapped to a certain degree. The mapping results for the Ihtm unit, the Chtr unit and much of the Iltm unit are similar across all PCA-based classifications. Moreover, the results for the top 6, 8 and 12 PCs are nearly identical, because the additional PCs contribute little. Based on the above analysis, we suggest that the optimal number of PCA dimensions is three, which offers a good balance between low redundancy and high classification accuracy.
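The idea that a few PCs can stand in for 12 redundant features can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in this study: the number of PCs is chosen here by a cumulative explained-variance threshold (a common stand-in criterion), whereas the study compares classification maps. The data are synthetic, and the names `explained_variance_ratio` and `n_components_for` are hypothetical.

```python
import numpy as np


def explained_variance_ratio(X):
    """Fraction of total variance carried by each PC of standardized X."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    s = np.linalg.svd(Z, compute_uv=False)   # singular values, descending
    var = s ** 2
    return var / var.sum()


def n_components_for(ratio, threshold=0.95):
    """Smallest number of leading PCs whose cumulative ratio reaches threshold."""
    cum = np.cumsum(ratio)
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, len(ratio))


# Synthetic stand-in (an assumption): 12 features built from 3 independent
# latent signals, 4 noisy copies each, mimicking strongly redundant features.
rng = np.random.default_rng(42)
n = 500
latent = rng.normal(size=(n, 3))
X = np.repeat(latent, 4, axis=1) + 0.05 * rng.normal(size=(n, 12))

ratio = explained_variance_ratio(X)
k = n_components_for(ratio, threshold=0.95)
```

With real T B features the latent structure is not known in advance, so the threshold should be treated as a tunable assumption and checked against the classification result.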
The proportions of geological units in Mare Fecunditatis and its vicinity mapped by each method are counted, and the corresponding bar charts are presented in Figure 16. Generally, the proportion of the geological units listed in descending order is Chtr, Iltm, Cc/Im, Ihtm.

We also mapped Mare Fecunditatis using the T B noon features, the T B midnight features and the dT B features as separate feature groups, and the results are shown in Figure 18. The result for the T B noon features largely confuses Cc and Iltm. The mapping result of the T B midnight features appears chaotic, with a large-scale misclassification of the Chtr unit. Yet, the Cc and Im units are well classified, which confirms the conclusions drawn in Sections 4.3 and 4.4. For the dT B features, the Im and Cc units are confused to some degree. Overall, our analysis demonstrates that each feature has its unique strengths and limitations in mapping mare units.
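Counting unit proportions from a classified map can be sketched as below. The label grid is a toy example, not a result of this study, and the sketch assumes that every pixel covers the same area (which does not hold exactly on a latitude-longitude grid). The function name `unit_proportions` is hypothetical.

```python
from collections import Counter

UNITS = ("Ihtm", "Chtr", "Iltm", "Im", "Cc")


def unit_proportions(label_map, units=UNITS):
    """Return the fraction of pixels assigned to each geological unit.

    label_map is a list of rows, each row a list of unit labels.
    Pixels with labels outside `units` are ignored.
    """
    counts = Counter(label for row in label_map for label in row)
    total = sum(counts[u] for u in units)
    if total == 0:
        raise ValueError("label_map contains no pixels of the requested units")
    return {u: counts[u] / total for u in units}


# Toy classified map (hypothetical labels, for illustration only).
toy_map = [
    ["Chtr", "Chtr", "Iltm", "Cc"],
    ["Chtr", "Iltm", "Iltm", "Im"],
    ["Chtr", "Chtr", "Ihtm", "Cc"],
]

props = unit_proportions(toy_map)
ranked = sorted(props, key=props.get, reverse=True)  # descending proportion
```

Applying the same function to the label map of each method gives the per-method values that a bar chart such as Figure 16 compares.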

Discussions
In this paper, a unified framework for assessing CELMS data for mapping mare basalt units is proposed, followed by a case study on Mare Fecunditatis. To the best of our knowledge, this is the first study that systematically analyzes the brightness temperature features of Chang'e-2 CELMS data for each mare basin unit. Based on the experimental results, some conclusions can be drawn:

1.
... features. The possible explanation is that the different Ti and Fe contents lead to varying cooling rates during the lunar night.

2.
The frequency range of the observation influences the capability to distinguish different basalt units. High-frequency features with a shallow penetration depth, especially 19.35 GHz, can better map Mare Fecunditatis, with fewer classification errors in the Cc, Im and Iltm units. The possible explanation is that Im and Iltm are early-age(?) mare basalt and, together with the crater ejecta Cc, contain less Ti and are highly dielectric. Yet, 37.0 GHz features may be influenced by the lunar dust from the regolith, leading to a slightly poorer result. Low-frequency features (especially 3.0 GHz) may classify certain parts of the mare basalt as Cc units. We suppose that the deep layer of Mare Fecunditatis is the impact basin (interpreted as the Cc unit), which was not filled by magma when Mare Fecunditatis was formed (possibly in the Imbrian period, 3.2–3.85 Ga ago) [27].

3.
The advantage of utilizing the dT B features proposed in our previous research [18–23] in distinguishing most classes in CELMS data is confirmed in this study. dT B features are verified for their strong ability to eliminate the latitude difference and strengthen the difference in the cooling effects of various basalt units. We also found that dT B features have a stronger ability to distinguish mare basalt, especially between early(?)- and late(?)-aged basalt, in certain aspects.

4.
Redundancy exists universally in CELMS features. This study points out the need to conduct dimension reduction on CELMS features. We show for the first time that only three PCs in the PCA feature space can represent almost all of the information in the 12 original features. After dimension reduction, the difficult-to-identify Im unit hidden in the southern part of Mare Fecunditatis is better classified. The scatterplots in the PCA feature space support this finding.

5.
A new geological map of Mare Fecunditatis is generated in this study from CELMS data by using a supervised machine learning method. It largely agrees with the previous mapping results obtained by other researchers from Clementine UV/VIS data [30,48–50], indicating that CELMS data are effective in mapping mare basalt units.

Conclusions
In this paper, a framework is proposed to systematically assess the brightness temperature features of Chang'e-2 CELMS in Mare Fecunditatis. As far as we know, this is the first time Chang'e-2 CELMS T B data have been systematically analyzed, and based on this, a new geological map of Mare Fecunditatis was generated by using machine learning techniques. The basalt map derived from CELMS data demonstrates many similarities with the previous geological map provided by Kramer et al., while providing some new hints for understanding the vertical structure of the lunar surface regolith.
This study quantitatively analyzed the effect of each brightness temperature feature in mapping the mare basin from various perspectives. In future studies, an analysis will be conducted on more lunar regions of research interest, including the landing site of Chang'e-5, where the surface regolith ground truth is obtained by in situ measurements and returned sample analysis [51].
Furthermore, machine learning methods have proved feasible for CELMS data. However, up to now, studies on CELMS data have still mainly relied on manual interpretation. Recently, machine learning models have been gradually introduced into lunar and planetary research [52–54]. Further studies will explore the potential of more advanced machine learning models with regard to lunar brightness temperature analysis [55–57].
Additionally, fusing optical (such as WAC images and Clementine UV/VIS data) and microwave remote sensing for mapping the mare basin could be beneficial, as many studies on planetary surface mapping have shown that multisource remote sensing helps to improve classification accuracy [33,53,58–61].

Figure 1. Location and WAC image of Mare Fecunditatis. (a) WAC image of the nearside of the Moon, where the location of Mare Fecunditatis is indicated by a green ellipse; (b) WAC image of Mare Fecunditatis.

Figure 2. T B and dT B feature maps of Mare Fecunditatis. (a) 3.0 GHz T B features; (b) 7.8 GHz T B features; (c) 19.35 GHz T B features; (d) 37.0 GHz T B features; (1) T B noon features; (2) T B midnight

Figure 7. Histograms of T B and dT B features for 5 types of sample units in Mare Fecunditatis. (a) 3.0 GHz T B features; (b) 7.8 GHz T B features; (c) 19.35 GHz T B features; (d) 37.0 GHz T B features; (1) T B noon features; (2) T B midnight features; (3) dT B features.

Figure 8. Boxplot of T B and dT B features in 5 sample units of Mare Fecunditatis. (a) 3.0 GHz T B features; (b) 7.8 GHz T B features; (c) 19.35 GHz T B features; (d) 37.0 GHz T B features; (1) T B noon

Figure 9. Pearson coefficient of features.

Generally, dT B features have a high positive correlation. T B noon features also have a high positive correlation, but it decreases as the frequency (corresponding with the penetration depth) increases. The correlation between dT B features and T B noon features increases with the frequency. In particular, all 19.35 GHz and 37.0 GHz T B noon features have a high positive correlation with all dT B features. Features among T B midnight, and between T B midnight and dT B, tend to transition from a positive correlation to no correlation and then to a negative correlation. Moreover, features with closer frequencies exhibit a higher correlation. Overall, a significant redundancy exists among these brightness temperature features. The class-by-class correlation coefficient is also analyzed and shown in Figure 10. We found that an extremely high correlation exists among Iltm, Im and Cc, indicating that distinguishing these three basalt units in CELMS data would be challenging. In addition,
present the class-by-class contribution, and all-classes contribution results are illustrated in Figure 11. The results of ReliefF largely agree with the conclusion in Section 4.3: compared with T B midnight features, T B noon features are better suited to distinguishing the class pairs Ihtm-Chtr, Ihtm-Iltm, Ihtm-Cc, Ihtm-Im, Chtr-Cc, Chtr-Iltm and Chtr-Im, whereas T B midnight features show their advantages in classifying Cc-Im, Cc-Iltm and Im-Iltm, and dT B features are appropriate for classifying most classes. Because the T B noon features distinguish more class pairs, their all-classes contribution weights are higher than those of the T B midnight features. The 19.35 GHz and 37.0 GHz dT B features show their superiority for distinguishing Cc-Iltm, which also accords with the conclusion drawn in Section 4.3.

Figure 11. Normalized ReliefF and OOB importance of all classes.

Figure 12. Random forest classification result for all original features.

Figure 13. Contribution rate of each PC.

Figure 15. Classification results obtained by PCA + RF with different numbers of PCs. (a) Two PCs; (b) Three PCs; (c) Four PCs; (d) Six PCs; (e) Eight PCs; (f) All 12 PCs.

Figure 16. Proportion of each geological unit in Mare Fecunditatis and its vicinity with different methods.

Figure 18. Classification with different acquisition times and dT B features. (a) T B noon features; (b) T B midnight features; (c) dT B features.

Author Contributions: Conceptualization, Y.L., Z.M. and Y.Z.; methodology, Y.L. and Z.Y.; software, Z.Y.; validation, Y.L. and Z.Y.; formal analysis, Z.Y.; investigation, Z.Y. and Z.M.; resources, Y.L., Z.M., J.P. and Y.Z.; data curation, Z.M.; writing-original draft preparation, Y.L. and Z.Y.; writing-review and editing, Y.L. and Z.M.; visualization, Z.Y. and Z.M.; supervision, J.P. and Y.Z.; project administration, J.P.; funding acquisition, J.P. and Y.Z. All authors have read and agreed to the published version of the manuscript.

Funding: This work was supported in part by the National Key Research and Development Program of China (2021YFA0715101), the Strategic Priority Program of the Chinese Academy of Sciences (XDB41020104) and the Natural Science Foundation of China (41706201).

Table 1. Normalized distance between mare basalt units.

Table 2. J-S divergence between mare basalt units.
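The two separability measures named in Tables 1 and 2 can be sketched as follows. The sketch assumes the common definition of normalized distance, |mu_a - mu_b| / (sigma_a + sigma_b), and a base-2 J-S divergence computed on binned samples; the exact definitions and binning used for the tables are not restated here, so both are assumptions. The sample values are hypothetical brightness temperatures in kelvin, and all function names are illustrative.

```python
import math
from statistics import mean, pstdev


def normalized_distance(a, b):
    """Assumed definition: |mean_a - mean_b| / (std_a + std_b)."""
    return abs(mean(a) - mean(b)) / (pstdev(a) + pstdev(b))


def histogram(samples, lo, hi, bins):
    """Normalized histogram of samples over [lo, hi] with equal-width bins."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in samples:
        idx = int((x - lo) / width)
        idx = max(0, min(idx, bins - 1))   # clamp out-of-range values
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2), bounded between 0 and 1."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the sum.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical T B samples (K) for two units; not values from this study.
unit_a = [250.1, 250.4, 250.9, 251.2, 251.6]
unit_b = [252.8, 253.1, 253.5, 253.9, 254.3]

d_norm = normalized_distance(unit_a, unit_b)
p = histogram(unit_a, lo=250.0, hi=255.0, bins=5)
q = histogram(unit_b, lo=250.0, hi=255.0, bins=5)
d_js = js_divergence(p, q)
```

Both measures are symmetric in the two units; larger values indicate that a feature separates the pair more clearly, and the J-S value depends on the chosen bin width.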