A Global Gravity Reconstruction Method for Mercury Employing Deep Convolutional Neural Network

Abstract: Mercury, the enigmatic innermost planet of the solar system, is one of the most important targets of space exploration. High-quality gravity field data are significant for refining the physical characterization of Mercury in planetary exploration missions. However, Mercury's gravity model is limited by relatively low spatial resolution and stripe noise, preventing fine-scale analysis and applications. By analyzing Mercury's gravity and topography data in the 2D spatial field, we find that they have fairly high spatial structure similarity. Based on this, a novel convolutional neural network (CNN) approach is proposed in this paper to improve the quality of Mercury's gravity field data. The CNN extracts the spatial structure features of gravity data and constructs a nonlinear mapping between low- and high-degree data directly. From a low-degree gravity input, the corresponding initial high-degree result can be obtained. Meanwhile, the structural characteristics of the high-resolution digital elevation model (DEM) are extracted and fused into the initial data, to get the final stripe-free result with improved resolution. Given the paucity of Mercury data, high-quality lunar datasets are employed for pretraining, after verifying the spatial similarity between the gravity and terrain data of the Moon. The HgM007 gravity field obtained by the MErcury Surface, Space ENvironment, GEochemistry and Ranging (MESSENGER) mission at Mercury is selected for experiments, to test the ability of the proposed algorithm to remove the stripes caused by quality differences of the highly eccentric orbit data. Experimental results show that our network can directly produce stripe-free, higher-degree data from low-degree input, implicitly assuming a lunar-like relation between crustal density and porosity. Although the CNN-based method is not sensitive to subsurface features absent from the initial dataset, it provides a new perspective for gravity field refinement.


Mercury's Datasets
All the Mercury datasets used in this paper come from the MESSENGER mission, which was launched in August 2004 and entered a highly eccentric orbit around Mercury in 2011, with periapsis in the northern hemisphere. The acquisition time of the datasets is from 2011 to 2015 [20,21].

Gravity Field Data of Mercury
During the MESSENGER mission, the degree of the Mercury gravity field model was improved markedly: Smith et al. [22] released a 20-degree gravity field model using the initial, less-than-six-month orbit-tracking data. With the passage of time and the acquisition of new data, a 50-degree gravity field model [23] and a 100-degree gravity field model [24] were released subsequently.
However, the orbit of MESSENGER is fairly eccentric, with a periapsis located at high northern latitudes, which results in highly latitude-dependent data quality [23]. The degree strength of the data in Reference [24] reaches 100 in the northern hemisphere, but those in the southern hemisphere can only reach 40 degrees, while there are stripes in the gravity grid data in northern middle-latitude areas (in the red box of Figure 1) due to the incomplete coverage of the orbital data.

Mercury's Global DEM
The Mercury global DEM is derived from images collected by the cameras onboard MESSENGER, through a simultaneous least-squares bundle adjustment of common features in the Integrated Software for Imagers and Spectrometers (ISIS3), as shown in Figure 2b [25]. Since altimetric topography is available mostly for the northern hemisphere, the final DEM combines the northern-hemisphere topography with a co-registration of images from the Mercury Dual Imaging System (MDIS) narrow-angle camera (NAC) and multispectral wide-angle camera (WAC) [26,27], from which the southern-hemisphere topography is then inferred. The spatial resolution of the global DEM raster data is 0.015625°/pixel. During the experiments, it is downsampled to the scale corresponding to the high-degree gravity output.


Gravity Field Data of the Moon
The lunar gravity field model has also developed from low degree to high degree [28]. Prior to the Gravity Recovery and Interior Laboratory (GRAIL) mission [29][30][31], the model released by the SELenological and ENgineering Explorer (SELENE) mission only reached 100 degrees [32]. The two GRAIL probes were launched in 2011, and each can exchange signals with Earth and with the other probe; the gravity field is recovered by measuring the change in the distance between them. This novel measurement mode dramatically improves the resolution of the lunar gravity model, up to 1500 degrees. The degree strength in the polar regions is 890, and the central region of the lunar far side reaches 680 [33,34], as shown in Figure 3a. The spherical harmonic expansion of the GRAIL gravity field model shows that stripe noise appears in some regions when the resolution exceeds 1200 degrees. After comprehensive consideration, data below 800 degrees are utilized as stripe-free input data.

Lunar Global DEM
The global DEM product of the Moon (Figure 3b) is based on altimetry data acquired by the Lunar Orbiter Laser Altimeter (LOLA) during the Lunar Reconnaissance Orbiter (LRO) mission phase LRO_ES_05 [35,36]. The resolution is 256 pixels/degree [37]. It is likewise downsampled to the scale corresponding to the high-degree gravity output.
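Both DEM products are reduced to the grid size of the target gravity degree before use. A minimal block-averaging sketch of this step in Python is shown below; this is an assumption for illustration, since the released products may use a different resampling scheme:

```python
import numpy as np

def block_average(grid, factor):
    """Downsample a raster by averaging non-overlapping factor x factor
    blocks (a minimal sketch, not necessarily the resampling used for
    the released DEM products)."""
    h, w = grid.shape
    assert h % factor == 0 and w % factor == 0, "grid must tile evenly"
    return grid.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

# Toy example: reduce a 4 x 4 raster to 2 x 2 by averaging 2 x 2 blocks.
print(block_average(np.arange(16.0).reshape(4, 4), 2))
```

For example, a 100-degree gravity grid stored as 200 × 100 points would require reducing the DEM raster by the corresponding integer factor.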



Spatial Structure Similarity Analysis of Gravity Data and DEM
Mercury is relatively flat: the difference between the lowest and highest topographic points is less than 10 km [38]. Likewise, the gravity anomalies of HgM007 in Figure 2a lie in the range of −270 to 200 mGal, which is reasonably minor, and their distribution trend is similar to that of the DEM. The lunar datasets show the same pattern. Admittedly, free-air gravity anomalies carry information on mass anomalies at depths ranging from the topographic relief down to the Core-Mantle Boundary (CMB), and are therefore not necessarily related to surface features; nevertheless, from the data-driven perspective, the two kinds of data do exhibit 2D spatial internal structure similarity.
To quantitatively assess the spatial structure similarity, the spatial size of the 100-degree grid data is set to 200 × 100, that of the 50-degree grid data to 100 × 50, and so on. The corresponding DEM datasets are downsampled to the same scale as the gravity datasets for quantitative comparison. The Pearson correlation coefficient (r) and the structural similarity index (SSIM) [39] are employed to quantitatively assess their spatial relationship:

$$ r = \frac{\sum_{i=1}^{n}\left(x_{g\_i} - \overline{x}_{g}\right)\left(x_{D\_i} - \overline{x}_{D}\right)}{\sqrt{\sum_{i=1}^{n}\left(x_{g\_i} - \overline{x}_{g}\right)^{2}}\sqrt{\sum_{i=1}^{n}\left(x_{D\_i} - \overline{x}_{D}\right)^{2}}} $$

where n denotes the total number of grid points; x_g_i and x_D_i are the elements of the gravity and DEM data, respectively; and x̄_g and x̄_D are their mean values. The larger r is, the stronger the correlation.
$$ \mathrm{SSIM} = \frac{\left(2\mu_{g}\mu_{D} + C_{g}\right)\left(2\sigma_{gD} + C_{D}\right)}{\left(\mu_{g}^{2} + \mu_{D}^{2} + C_{g}\right)\left(\sigma_{g}^{2} + \sigma_{D}^{2} + C_{D}\right)} $$

where μ_g and μ_D stand for the mean values of the normalized gravity grid data and the normalized DEM data, respectively; σ_g² and σ_D² are the variances; σ_gD is the covariance; and C_g and C_D are constants that prevent the denominator from being 0. The closer SSIM is to 1, the more similar the spatial structures.
It can be seen in Table 1 that the values of the gravity and topography data are not strongly correlated, with r only in the range of 0.47 to 0.66, but the SSIMs reach above 0.96. Consequently, as heterogeneous data, there is no strong numerical correlation between them, so direct linear regression is not applicable. However, they have strong spatial structure similarity, which means the structure information of the high-resolution DEM can be extracted to help reconstruct the gravity data in the spatial domain.
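The two indices can be computed directly on the gridded data. The sketch below evaluates SSIM with a single global window over the normalized grids, a simplification of the windowed SSIM of [39]; the constants c1 and c2 are hypothetical stand-ins for C_g and C_D:

```python
import numpy as np

def pearson_r(x, y):
    """Pearson correlation coefficient r between two grids."""
    x, y = np.ravel(x), np.ravel(y)
    xm, ym = x - x.mean(), y - y.mean()
    return float((xm * ym).sum() / np.sqrt((xm**2).sum() * (ym**2).sum()))

def global_ssim(g, d, c1=1e-4, c2=9e-4):
    """SSIM between min-max-normalized gravity and DEM grids, evaluated
    with one global window (a simplification of the sliding-window SSIM).
    c1 and c2 stand in for the stabilizing constants C_g and C_D."""
    g = (g - g.min()) / (g.max() - g.min())
    d = (d - d.min()) / (d.max() - d.min())
    mu_g, mu_d = g.mean(), d.mean()
    cov = ((g - mu_g) * (d - mu_d)).mean()
    num = (2 * mu_g * mu_d + c1) * (2 * cov + c2)
    den = (mu_g**2 + mu_d**2 + c1) * (g.var() + d.var() + c2)
    return float(num / den)
```

For a 100-degree model, both inputs would be 200 × 100 arrays, the gravity grid and the DEM downsampled to the same scale.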

Methodology
The flowchart of the proposed method is shown in Figure 4. A nonlinear mapping between low-degree and coarse high-degree data is built. Simultaneously, the high-resolution topography information is employed for detail reconstruction, to get the final output. After training with a converged loss, the learned network can be applied to Mercury gravity downscaling and destriping. Details of our algorithm are given below. The entire structure of the proposed algorithm, consisting of two parts, is depicted in Figure 5: the gravity field reconstruction and the DEM refining. To maintain the original information as much as possible, no pooling layer is employed in the architecture. We instead make the network deeper, to enlarge the receptive field, exploiting more context and higher nonlinearities to strengthen the predictive power. The network learns a nonlinear end-to-end mapping between different degrees of Mercury's gravity field, employing the low-degree gravity field and the high-resolution DEM data simultaneously.


Gravity Field Data Reconstruction Network
In this part, low-degree data are input to obtain the preliminary high-degree result directly through the network, as shown in the upper part of Figure 5. This portion is based on DenseNet [40], which makes full use of the hierarchical features of the original data, substantially reducing the number of parameters and stabilizing the training of the deep network. It is composed of three modules: feature extraction, global feature fusion, and downscaling. One convolution layer, f_0, is first used to extract low-level features, F_0, from the low-degree input G_L. Then D dense blocks are applied to learn the high-level features. The output F_d of the d-th dense block can be obtained as follows:

$$ F_{d} = f_{d}\left(F_{d-1}\right) $$

where f_d denotes the operations of the d-th dense block. Specifically, in each dense block, the feature maps of all preceding layers are concatenated as inputs for each layer. Denote F_d^i as the output features of the i-th layer in the d-th dense block:

$$ F_{d}^{i} = f_{d}^{i}\left(\left[F_{d}^{1}, \cdots, F_{d}^{i-1}\right]\right) $$

where f_d^i is a composite function, including convolution and Rectified Linear Unit (ReLU), and [F_d^1, ⋯, F_d^{i−1}] refers to the concatenation of the feature maps produced by convolutional layers 1, ⋯, (i − 1) in the d-th dense block. If each function f_d^i generates k feature maps, it follows that the i-th layer has k × (i − 1) input maps. The hyper-parameter k refers to the growth rate of the network.
The output of each dense block thus fully utilizes all convolutional layers within the block, via the short paths created between each layer and every other layer.
After extracting hierarchical features with a set of dense blocks, feature fusion is further conducted in a global manner:

$$ F_{f} = f_{f}\left(\left[F_{1}, \cdots, F_{D}\right]\right) $$

where f_f is a 1 × 1 convolution, and F_f denotes the fused features. The feature fusion can adaptively generate informative characteristics and improve computational efficiency. Then one convolution layer for reconstruction, followed by a deconvolution layer, is used to get the high-degree gravity field data:

$$ G_{H}^{c} = f_{ds}\left(F_{f}\right) $$

where f_ds is the operation of the downscaling unit, and G_H^c is the output of the first part, which is considered a coarse high-degree gravity field.
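The channel growth implied by the concatenation rule can be checked with simple bookkeeping. The sketch below is illustrative only; the 16 maps entering the block are a hypothetical value, while the growth rate of 16 and six layers per block match the settings reported in the experiments:

```python
def dense_block_channels(c_in, k, num_layers):
    """Track feature-map counts through one dense block: each layer sees
    the concatenation of everything before it and appends k new maps
    (k = growth rate)."""
    inputs = []
    c = c_in
    for _ in range(num_layers):
        inputs.append(c)   # width of the concatenated input at this layer
        c += k             # the layer contributes k new feature maps
    return inputs, c

# Hypothetical 16 maps entering the block; growth rate k = 16 and
# 6 layers per block, as in the experiments.
print(dense_block_channels(16, 16, 6))  # ([16, 32, 48, 64, 80, 96], 112)
```

The returned total (112 maps leaving the block) is what the next block or the global feature fusion would receive under these assumptions.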

DEM Refining Network
The spatial correlation between the DEM and the gravity field was demonstrated in Section 2.3. Since topographic mapping has been developing rapidly toward higher quality, the higher-resolution spatial structure of the DEM can be integrated into the gravity field reconstruction process to obtain a more detailed, high-precision gravity product.
Taking the spatial similarity between the gravity field data and the DEM into account, the high-resolution DEM is concatenated with the coarse output for further extraction of informative features:

$$ F_{GD} = C\left(G_{H}^{c}, D_{H}\right) $$

where C is the concatenation operation, and F_GD is the concatenated feature of the coarse high-degree gravity field, G_H^c, and the high-resolution DEM, D_H. Since the coarse high-degree gravity field data and the final output largely share the same distribution, explicitly modeling the residual information with an identity skip-connection from the input to the end can achieve a faster convergence rate and superior performance [41]. Therefore, a three-layer residual network is employed as follows:

$$ G_{H} = G_{H}^{c} + f_{GD}\left(f_{GD}^{2}\left(f_{GD}^{1}\left(F_{GD}\right)\right)\right) $$

where f_GD^1 and f_GD^2 are the first and second convolutions (including ReLU), respectively, which extract abundant informative characteristics from the already-downscaled concatenated feature F_GD; these are then leveraged by a convolution f_GD to reconstruct the residual details. Ultimately, the residual information and the input coarse high-degree gravity field data are added element-wise to get the final high-degree gravity field, G_H.
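A shape-level sketch of this refining stage, with the three convolution layers replaced by hypothetical stand-in callables, illustrates the channel concatenation and the identity skip-connection:

```python
import numpy as np

def refine(coarse_gravity, dem, f1, f2, f_gd):
    """Shape-level sketch of the DEM refining stage: concatenate the
    coarse high-degree gravity with the DEM along a channel axis, pass
    through a three-layer residual path, and add the predicted residual
    back onto the coarse input (identity skip-connection)."""
    f_GD = np.stack([coarse_gravity, dem], axis=-1)  # concat of G and DEM
    residual = f_gd(f2(f1(f_GD)))                    # three-layer residual path
    return coarse_gravity + residual                 # element-wise addition

# Toy stand-ins for the convolution layers (hypothetical, for shape only):
relu = lambda x: np.maximum(x, 0.0)    # conv + ReLU placeholder
head = lambda x: 0.0 * x.sum(axis=-1)  # residual head collapsing channels
coarse = np.ones((4, 4))
out = refine(coarse, np.zeros((4, 4)), relu, relu, head)
# With a zero residual path, the skip-connection returns the coarse input.
```

The stand-ins only preserve shapes; in the network itself, the residual path consists of learned 3 × 3 convolutions.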
In addition, the proposed network is optimized via minimizing the mean absolute error (MAE) between the downscaled output and the input low-degree gravity field data.

Experiments and Discussion
For the purpose of testing the feasibility and transferability of our method, two groups of experiments are designed for validation. Moreover, a fourfold downscaling is also conducted to explore the possibility of reaching a higher resolution. Both the coarse and fine outputs are compared with the results of the bicubic interpolation and the original HgM007. In 2019, HgM008 was released, with improved planet and spacecraft dynamical models [42]. Although its highest degree strength is still 100, the data quality has been enhanced. Therefore, the HgM008 grid data are compared with our results to evaluate the destriping performance of our algorithm. Quantitative indices, visual effects and spectrum power are combined for detailed analysis.

Parameter Setting and Network Training
Given that the size of the input seriously affects the scale of the network, the training data are prepared as 30 × 30 sub-grids randomly cropped from the original training dataset. Except for the 1 × 1 convolutional layers in the global feature fusion, the kernel size of the other convolutional layers is 3 × 3, to incorporate more nonlinear rectification layers and reduce the computational complexity [43]. Six dense blocks, each consisting of six convolution layers, are applied, resulting in 36 convolution layers. A growth rate of 16 is set in each block, to achieve superior performance through feature reuse with less memory and computation [40]. To make sure that all feature maps are of the same size, zero-padding is used before each convolution operation. The whole network is optimized via the Adam optimizer with the default settings [44]. During the reconstruction, the mini-batch size and the weight decay are set to 32 and 10⁻⁴, respectively. The weights are initialized by using the method of He et al. [45], and the biases are set to zero. Finally, the learning rate is initialized as 10⁻⁴ and halved every 100 epochs, until convergence. We implement the proposed network with the TensorFlow framework on an NVIDIA RTX 2080Ti GPU.
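The step-decay schedule described above can be written directly:

```python
def learning_rate(epoch, base=1e-4, halve_every=100):
    """Step decay used in training: start at 1e-4 and halve the rate
    every 100 epochs."""
    return base * 0.5 ** (epoch // halve_every)

# learning_rate(0)   = 1e-4    (epochs 0-99)
# learning_rate(250) = 2.5e-05 (third interval, halved twice)
```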

Quantitative Evaluation Indicators
Three indices are selected. First, the Pearson correlation coefficient, denoted by r, is used to reflect the proximity of the estimated and the original data. Second, the SSIM is also used here to evaluate the spatial similarity, showing the degree of enhancement in the spatial domain. Finally, the root mean square error (RMSE) is adopted to assess the deviation:

$$ \mathrm{RMSE} = \sqrt{\frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N}\left(o_{x,y} - e_{x,y}\right)^{2}} $$

where (x, y) is the index of the grid data, and M and N denote the size of the grid data. Additionally, o_{x,y} and e_{x,y} denote the original value and the estimated value of the (x, y) element.
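The RMSE definition above translates directly to NumPy:

```python
import numpy as np

def rmse(original, estimated):
    """Root mean square error over an M x N grid, per the definition
    in the text: sqrt of the mean squared deviation o - e."""
    o = np.asarray(original, dtype=float)
    e = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((o - e) ** 2)))
```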

Compared Method
Since there have been few studies on refining gravity data in the 2D spatial domain, bicubic interpolation is selected as the main comparison method, owing to its robust performance.
Bicubic interpolation is the most commonly used interpolation method in 2D space [46]. The interpolated value P at the point (x, y) is obtained by a weighted average of the 16 nearest points in the rectangular grid, with a cubic interpolation function applied in the horizontal and vertical directions, respectively. For 2D grid data, the interpolation function can be expressed as follows:

$$ P(x, y) = \sum_{i=0}^{3}\sum_{j=0}^{3} p_{ij}\, W\left(x - x_{i}\right) W\left(y - y_{j}\right) $$

where W is the weight function:

$$ W(t) = \begin{cases} (a+2)|t|^{3} - (a+3)|t|^{2} + 1, & |t| \le 1 \\ a|t|^{3} - 5a|t|^{2} + 8a|t| - 4a, & 1 < |t| < 2 \\ 0, & \text{otherwise} \end{cases} $$

and a is taken as −0.5 in our experiments.
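The weight function W with a = −0.5 (Keys' cubic convolution kernel, matching the piecewise form used here) can be written as:

```python
def bicubic_weight(t, a=-0.5):
    """Cubic convolution weight W(t) with a = -0.5, as in the experiments."""
    t = abs(t)
    if t <= 1:
        return (a + 2) * t**3 - (a + 3) * t**2 + 1
    if t < 2:
        return a * t**3 - 5 * a * t**2 + 8 * a * t - 4 * a
    return 0.0

# Each of the 16 surrounding points receives the weight W(dx_i) * W(dy_j);
# for any fractional offset the four 1D weights sum to 1 (partition of unity).
```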

Experiment 1: The Feasibility of the Proposed Network
In order to fully learn the characteristics of gravity data at various scales (obtained by GRAIL and converted from spherical harmonic coefficients), five kinds of degree pairs, 20,000 pairs in total (80% of the lunar dataset), are chosen for pretraining: 25-50, 50-100, 100-200, 200-400, and 400-800 degrees. The remaining 20% of the 25-50-, 50-100- and 400-800-degree areas, not used for training, is selected as test data. Meanwhile, the DEMs are downsampled to the needed scale.
First, we test the network on lunar data, and the outputs are compared with the corresponding original GRAIL data. From the results in Table 2, we can see that all the indicators improve with the proposed network compared with the bicubic interpolation, and reach fairly good values, which indicates that our method is feasible. Additionally, as the degree of the data increases, more geographical details appear. The loss of details leads to a general decline in the numerical indicators, but the advantages of our approach are also reflected: it is difficult for the bicubic interpolation to reconstruct smaller-scale details, while the proposed method can restore them via the injection of the high-resolution topographical spatial structure. Second, in order to assess the generalization performance of the proposed method, the less precise 100-degree SELENE model [32] and the corresponding DEM are adopted as test data. By comparing the reconstructed SELENE with the more precise GRAIL data, we can avoid the overfitting that may occur when only using GRAIL data for testing.
The quantitative indicators of the test on SELENE are presented in Table 3, and the visual results are shown in Figure 6. It can be observed in Figure 6a,b that SELENE and GRAIL differ in many places, especially in the polar regions, because of their different tracking techniques. After reconstructing the 100-degree SELENE to 200 degrees, the correlation between SELENE and GRAIL increases, while the difference decreases, as listed in Table 3. From Figure 6c,d, we can find that more details are generated in the 200-degree SELENE, and the errors in the polar regions are reduced. Therefore, our method generalizes stably, which benefits the transfer to the Mercury data.

Experiment 2: The Transferability between the Lunar and the Mercury Data
During Experiment 2, the different scales of lunar data from Experiment 1 are still used for training, while the test sets are 25- and 50-degree Mercury data. Since the initial parameters of our network are obtained from the lunar training data, a small number of 25-50-degree pairs are adopted to fine-tune [47] the parameters, so that the network becomes more applicable to Mercury's data. Then the network is tested with the 50-100-degree pairs again.
First, the 25-50-degree reconstruction result without fine-tuning, in Table 4, is analyzed. It can be found that our method still surpasses the bicubic interpolation, although the network is trained on lunar data. Then, for Mercury's 50-100-degree reconstruction results, all the indices decrease distinctly compared to those of the 25-50-degree results. We attribute this to the many stripe noises in the mid-latitude region of the original 100-degree data, whereas our estimated 100-degree data are free of stripe noise. Therefore, it is worth noting that the gravity data reconstructed by our algorithm overcome the stripe phenomenon.

As for the result of Mercury's 50-100-degree data after fine-tuning with the Mercury 25-50-degree training pairs, there are minor improvements in the indicators. Because of the low resolution, the samples available for fine-tuning are extremely limited, but the visual details are still enhanced, as shown in Figure 7, especially for the southern hemispherical data, which only reach 40 degrees. To sum up, the transferability is verified via Experiment 2, which also demonstrates that adding DEM information can help in recovering low-quality gravity data. Compared with the HgM008 grid data, the 100-degree fine-tuned result still obtains relatively good indicators, although lower than the others. The decline is mainly caused by the difference in the degree strength of the southern hemisphere between HgM008 and our results. Therefore, in order to better evaluate the destriping performance, the region from 0 to 45 degrees northern latitude is selected for quantitative comparison. As shown in Table 5, the 100-degree fine-tuned result and HgM008 have closer distributions, proving that the proposed network can improve the quality of the gravity grid data to some extent.

Experiment 3: The Fourfold Downscaling Accuracy
In order to further explore the improving capacity of the proposed method when the resolution difference becomes larger, 25-100-degree, 100-400-degree and 200-800-degree lunar gravity datasets are used as the third training set, for fourfold downscaling. All the training sets are divided into datacubes, and 80% of them are selected randomly for training. The 50-200-degree pairs not involved in training are adopted for testing. Although the 100-degree Mercury data are contaminated by stripes, for better transfer to the Mercury gravity model, a small amount of 25-100-degree Mercury data at 60 to 90 degrees north latitude is selected for fine-tuning, to produce the 200-degree product.
Since the original HgM007 only reaches degree 100, a direct assessment cannot be made for our 200-degree product. In order to prove that the texture presented in Figure 8c is not artificial, the lunar 50-200-degree dataset is used for testing and analogy analysis; the visual results are presented in Figure 9, and the indicators are shown in Table 6.
When performing fourfold downscaling, the detailed difference between the low- and high-degree datasets becomes greater, and the high-resolution spatial structure of the topography is given larger weight during training, offering a criterion for reconstruction. Therefore, comparing the bicubic result and the coarse output in Figure 8, the fine output has more small-scale textures close to the original data, and its indicators are also more prominent. From the experiments on the lunar 50-200-degree datasets, it can be inferred that the subtle texture in our 200-degree product is reliable and closer to the ground truth.

Power Spectrum Analysis and Discussion
The experimental results show that our output and HgM007 are consistent in overall distribution, while the proposed algorithm, with higher resolution and better spatial quality, reflects more small-scale details, such as small-diameter craters.
To further verify the dependability of our method, the power spectra of the 50-, 100-, and 200-degree results, and of the 50- and 100-degree products without the injection of DEM, are analyzed as indirect proof. The 2D gravity anomaly grids are converted to corresponding spherical harmonic coefficients using Wieczorek's SHTOOLS software [48]. Following Mazarico et al. [49], the Kaula coefficient is taken as 3 × 10⁻⁵. The power spectrum reflects the gravity signal intensity and information richness [50]: the larger it is, the stronger the signal and the more information the gravity model contains.
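The paper computes the spectra with SHTOOLS [48]; purely to illustrate the bookkeeping, the per-degree power and the Kaula reference curve can be formed as below. The coefficient layout and the synthetic check are assumptions for this sketch, not the authors' code.

```python
import numpy as np

def degree_power(clm, slm):
    """Per-degree power spectrum S(l) = sum_m (C_lm^2 + S_lm^2),
    with clm, slm stored as (lmax+1, lmax+1) arrays indexed [l, m]
    and entries for m > l left at zero."""
    return (clm**2 + slm**2).sum(axis=1)

def kaula_power(lmax, k=3e-5):
    """Per-degree power implied by Kaula's rule (RMS coefficient ~ k / l^2):
    (2l + 1) * (k / l^2)^2 for l >= 1."""
    spec = np.zeros(lmax + 1)
    l = np.arange(1, lmax + 1)
    spec[1:] = (2 * l + 1) * (k / l**2) ** 2
    return spec

# Synthetic check: concentrate each degree's Kaula power in C_l0,
# so the computed spectrum should match the Kaula curve exactly.
lmax = 100
clm = np.zeros((lmax + 1, lmax + 1))
slm = np.zeros_like(clm)
for deg in range(1, lmax + 1):
    clm[deg, 0] = np.sqrt(2 * deg + 1) * 3e-5 / deg**2
```

With real models, the coefficient arrays would come from expanding the gravity grids (e.g., via SHTOOLS), and curves sitting above the Kaula line at low degrees correspond to the behavior seen in Figure 10.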
As shown in Figure 10, the general trends of our results, HgM007, and HgM008 are consistent and fairly close to the Kaula constraint, though all exceed it at degrees below 25. Moreover, the 100-degree coarse output has a noticeable drop at degree 50, while the power spectrum curve of the fine output is smooth, which further validates that the spatial information of the DEM helps to better reconstruct the gravity field. Our 100-degree result is slightly below HgM007 and HgM008 in Figure 10b, which could be caused by the lack of contributions from subsurface mass anomalies without topographic expression. In Figure 10c, the 200-degree result has the largest power spectrum up to degree 120. We think this agrees with the conclusion of Experiment 3: as the downscaling factor increases, the weights of the topographic structures in the network become greater, bringing finer spatial information into the final result.

Conclusions
In this paper, a novel convolutional-neural-network-based gravity-data downscaling and destriping algorithm is presented. This deep learning approach is applied to planetary gravity field data reconstruction for the first time, whereby input low-degree data are reconstructed into higher-degree data through our network. First, the spatial structure similarity between DEM and gravity data is explored, which supports the idea that high-resolution topographic information can aid the reconstruction of gravity data. Then, to tackle the lack of Mercury training samples, high-quality lunar samples are adopted for pretraining, based on the similarity between the two bodies. Three sets of experiments and a power spectrum analysis demonstrate the feasibility and reliability of the proposed network.
Compared with HgM007, our estimation has a consistent distribution trend and shows more details. It can be used for orbit determination in future missions, such as BepiColombo, and for landing site selection. However, our method is data-driven and implicitly assumes a relation between gravity and topography at high degrees similar to that of the Moon. With this caveat in mind, it can still provide a reference for fine-scale analysis and interpretation.
In the future, we plan to combine the data-driven approach with geophysically constrained models, thereby pursuing more accurate and physically sound results.
Author Contributions: All the authors have contributed to this manuscript. S.Z. and D.L. designed the algorithm, did the experiments and finished the initial manuscript. Q.Y. and J.L. supervised and provided resources for this work and modified this manuscript. All authors have read and agreed to the published version of the manuscript.
