The Inﬂuence of Dynamic Solar Oblateness on Tracking Data Analysis from Past and Future Mercury Missions

: When the BepiColombo spacecraft arrives at Mercury in late 2025, it will be able to measure the orbit of the planet with unprecedented accuracy, allowing for more accurate measurements of the perihelion advance of the planet, as predicted by the Theory of General Relativity (GR). A similar effect is produced by the gravitational oblateness of the Sun through the zonal coefﬁcient J 2 (cid:12) . The gravitational ﬁeld of the Sun has been hard to determine despite centuries of observations, causing great uncertainties in experiments on GR. Recent publications in heliophysics suggest that J 2 (cid:12) is not a constant, but a dynamic value that varies with solar magnetic activity. The aim of this paper is to analyse what the effect is of suggested higher-order effects of the solar gravitational ﬁeld on experiments of the perihelion advance of Mercury as predicted by GR. The orbit of Mercury and observations of the MESSENGER and BepiColombo spacecraft are simulated, and parameters corresponding to gravitational theory, as well as the oblateness J 2 (cid:12) including a time-variable component are estimated using a least-squares approach. The result of the estimation is that the amplitude of a periodic component can be found with an uncertainty of 3.7 × 10 − 11 , equal to 0.017% the value of J 2 (cid:12) . From analysis of published experiments that used MESSENGER tracking data, it can already be deduced that the amplitude of the periodic variation cannot be higher than 5% of the value of J 2 (cid:12) . It is also found that if a periodic component exists with an amplitude greater than 0.04% the value of J 2 (cid:12) and it is not considered, it can lead to errors in the experiments of GR using BepiColombo data to the point that results falsely conﬁrm or contradict the Theory of General Relativity.


Introduction
The solar system has always been a suitable laboratory for testing gravitational theory. An often-used test is the secular precession of Mercury's perihelion, for which it was discovered in the 19th century that it could not be explained just by using Newtonian gravity and third-body interactions of other planets [1]. Albert Einstein used the observed precession of Mercury's perihelion as a key test to confirm his Theory of General Relativity (GR, [2]). Ever since, experiments on gravitational interactions have only confirmed General Relativity with increasing accuracy. GR is, however, not able to provide all the answers about gravitational interactions yet in fundamental physics, especially on the scales of quantum mechanics or cosmology [3][4][5]. Therefore, more than one hundred years after the theory was introduced, testing GR and its underlying principles is still a hot topic. Finding correctness with increasing precision or microscopic deviations of GR is of high relevance in the search for a universal theory in fundamental physics. The current state, future outlook and relevance of experiments on GR is provided in great detail by [5]. In this paper, the relativistic influence on the secular precession of Mercury's orbit will be the focus.
The orbit of Mercury is a test subject for such an experiment. The first-order post-Newtonian perturbation is the perihelion advance, which can be expressed as a function of the PPN parameters. Per orbit of Mercury around the Sun, the precession is equal to ∆ω (Equation (66) of [5]): where m and m M are the masses of the Sun and Mercury and p is the semilatus rectum of the orbit of Mercury around the Sun. With accurate observations of the orbit of Mercury the values of the PPN parameters can be constrained. However, the second-order zonal effect of the Sun, caused by the mass bulge at the equator, produces a similar effect which has to be considered (Equation (3) of [7]): J 2 is the normalised zonal coefficient of degree 2 of the Sun, R is the mean radius of the Sun and i is the inclination of the orbit of Mercury with respect to the solar equator. A major source of uncertainty in experiments of the PPN parameters is caused by the uncertainty of coefficient J 2 . Attempts to determine the shape of the Sun date back to the 19th century. The determination of the visual oblateness (the visual difference in radius between equator and poles) has turned out to be challenging, because among other reasons, the interior (i.e., mass distribution) is unknown, the surface rotation is not the same for all latitudes and the visual shape of the Sun is hard to observe due to magnetic surface activity. A comprehensive overview of experiments and their challenges is provided by [8], see also discussions by [9,10]. Zonal coefficient J 2 , which influences bodies orbiting the Sun, appears to not relate directly to its visual shape and its rotation rate [8,11], and therefore, the field of heliophysics has faced difficulty in coming up with constraints on J 2 that are useful as input for tests on gravitational physics. It is for that reason that in experiments that use planetary orbits, parameter J 2 is usually estimated alongside the PPN parameters. This limits the precision of the experiment, as γ, β and J 2 are all linearly proportional to the precession rate of Mercury's perihelion. γ can be distinguished through other experiments due to its effect on the propagation of light (see [12]), but for β or J 2 , no other experiment is available. This causes a high correlation between β and J 2 in their estimation. As a result, unmodelled variability of J 2 may pollute estimates of β and other parameters for testing GR. In light of the numerous near-future missions equipped with high-accuracy tracking systems which will contribute to improving solar system tests of relativity [13], analysing the influence of J 2 is highly pertinent, and it is the focus of this manuscript.
To illustrate the challenge of determining J 2 , Figure 1 shows selected attempts to determine the value of J 2 both by estimation using planetary orbits and by the field of heliophysics. It can be observed that uncertainties are often quite optimistic and results are not consistent with each other, presumably due to the correlation of J 2 and other parameters, or other assumptions that are made in the determination of the parameter and its uncertainty. Error bars indicate the 2σ uncertainty, corresponding to a 95% confidence level. Where no error bars are drawn, uncertainties were not provided. All entries in this plot are provided at the end of the paper in Table A1. To complicate matters further, it is suggested that higher-order zonal effects such as J 4 could also be of significant value to affect planetary orbits at a measurable level [8,14]. In addition, publications in heliophysics state that the visual solar oblateness varies along the 11-year solar activity cycle, and a similar periodic variation in the value of J 2 is suggested [15,16], although this theory has also been contested [17]. The effect that a dynamic value of J 2 would have on the planetary orbits has been calculated by [18] and the importance of considering it in gravitational experiments has been stressed by [19], in both cases, it is concluded that the dynamic effect can have a relevant influence on tests of GR. Nevertheless, a dynamic oblateness has so far never been considered in publications about solar system ephemerides or experiments of gravitational physics.
The aim of this study is to bridge the gap between heliophysics and gravitational physics, by investigating the different configurations that are suggested for the solar gravity field. The central question of this study is: with the observational capabilities of Mercury missions that are expected in the foreseeable future, can the various hypotheses about the solar gravity field be confirmed or rejected, and do they have a relevant influence on tests of GR? If certain hypotheses can be confirmed or rejected, it is of great value to both fields to refine their theoretical models further.
To date, the best experiments on the orbit of Mercury use data from the MESSENGER spacecraft by NASA which orbited Mercury from 2011 until 2015, see e.g., [7,20,21]. The best next opportunity will be the BepiColombo mission by ESA and JAXA, which is currently in cruise phase and will be inserted into Mercury orbit in 2025 for a nominal mission of one year. It will provide tracking data with unprecedented accuracy with its dedicated experiment, the Mercury Orbiter Radioscience Experiment (MORE, [22]), due to its dualfrequency tracking in the X and Ka bands. Simulations of relativity experiments using simulated BepiColombo tracking data show improvements of 1 to 2 orders of magnitude for constraints on β, e.g., [23][24][25][26][27][28][29][30]. Neither the experiments with MESSENGER data, nor the simulations of experiments with BepiColombo data have considered perturbations due to higher-order effects due to the solar shape besides a constant J 2 . In this work, a similar simulated experiment will be set up which will consider such higher-order effects. Tracking data of both MESSENGER and BepiColombo will be simulated, resulting in one combined data set which will span 20 years, allowing to track the effects of small perturbations on the orbit of Mercury over this duration.

Method
In this section, we describe the setup of our numerical experiments. First, we create a synthetic truth model by integrating the orbit of Mercury from 2008 to 2028, with the acceleration models described in Section 2.1 and all parameters set according to values consistent with GR. Using this numerically integrated state of Mercury, range observations are simulated according to the methods described in Section 2.2. A realistic error on the simulated observations is generated based on a simulation of the precise orbit determination of Mercury orbiters. The observations are used as input to estimate a set of parameters in a least-squares algorithm, which will be explained in Section 2.3. The software used for simulation and estimation is the TU Delft Astrodynamics Toolbox (Tudat, [31]).

Accelerations Acting on Mercury
For the numerical integration of Mercury's orbit, the dynamical model takes into account relevant perturbations that can produce a measurable effect in the MESSENGER and BepiColombo tracking data. These perturbations are: Deviations from GR, i.e., effects predicted by alternative theories of gravity that could have a measurable impact on the orbit of Mercury.
The point mass accelerations are included of all relevant solar system bodies which are the Sun, the 7 planets and the Moon. In addition, the 300 most influential asteroids are included. This selection of asteroids is often included in ephemerides due to their significant gravitational contribution, see, e.g., [32,33]. For smaller asteroids, as well as Trans-Neptunian Objects (TNOs) and other minor solar system bodies, the contribution is observed to be negligible in this study. The positions of all bodies are obtained from ephemeris DE430 [33] using SPICE kernels as the data source [34,35].
For the accelerations due to figure effects of the Sun, the largest effect is caused by zonal spherical harmonic coefficient J 2 , of which the acceleration can be expressed as [36]: whereP n0 is the normalised Legendre polynomial at degree n and order 0, φ is the longitude and λ is the latitude of Mercury with respect to the body fixed reference frame of the Sun, i.e., the frame that rotates with the 'surface' of the Sun, which is called IAU_SUN in SPICE [34,35]). the coefficient J 2 is in the order 10 −7 , see Figure 1 and Table A1. We model a time-varying J 2 by adding a time-variable correction to the mean value of J 2 . The sunspot number is commonly accepted as the measure for solar magnetic activity. Therefore, under the assumption that the variations in oblateness mainly depend on solar magnetic activity, as also shown in [16], a sinusoidal variation is a reasonable first-order correction. The time-variability of J 2 is therefore modelled as: Here, A J 2 is the amplitude of the periodic variation and will be included in the parameter estimation (see Section 2.3). The period P and phase ϕ are chosen such that the correction aligns with the solar activity cycle. In Figure 2, the monthly smoothed sunspot number is plotted, and the mission durations of MESSENGER and BepiColombo (nominal mission) are indicated on the horizontal axis. A period of 11 years is chosen with a phase tuned such that the correction c J 2 (t) reaches a minimum in December 2008 and December 2019. The next zonal coefficient J 4 provides the next largest perturbation due to figure effects, of which the acceleration can be expressed as [36]: For J 4 , values in the range from 10 −9 to 10 −7 are suggested from heliophysical analyses [8]. Even though suggested values for J 4 can become as high as the values of J 2 , the perturbation is much smaller as it drops off much more quickly with distance from the Sun (see Equation (5) in comparison to Equation (3)). If both zonal coefficients are in the order of magnitude 10 −7 , the perturbation of J 4 results in an acceleration acting on Mercury in the order of 10 −15 m/s 2 , whereas J 2 produces an acceleration in the order of 10 −12 m/s 2 . The maximum value of J 4 suggested in literature is 6.3 × 10 −7 [39], which causes a change in orbit after 20 years of a few meters, equal to the 1σ noise of MESSENGER range observations. The effect that the perturbation produces on the orbit of Mercury is equal to or smaller than the state determination uncertainty for realistic values of J 4 . In several runs in this study, J 4 was included in the parameter estimation, but the zonal coefficient could not be estimated with any relevant uncertainty. Omitting the J 4 effect did not impact the estimation of the other parameters. Therefore, it can be safely neglected in this experiment. For the gravitational force exerted by the Sun, the post-Newtonian relativistic acceleration is implemented according to Equation 7.42 of [11]. To simplify this equation, the assumption is made that the mass of Mercury m M is negligibly small with respect to the mass of the Sun (m + m M ≈ m , which is only off by a relative factor in the order of 10 −7 , an immeasurable amount in this experiment). Furthermore, we only consider gravitational theories that comply with conservation laws for total momentum, meaning that PPN parameters α 3 , ζ 1 , ζ 2 , ζ 3 and ζ 4 are all zero. The resulting acceleration term exerted on the Sun by Mercury is then: where r and v are the relative position and velocity vectors of Mercury with respect to the Sun, G is the universal gravitational constant and c is the speed of light in vacuum. Furthermore, the angular momentum of the Sun S produces the perturbation known as the Lense-Thirring effect, which is included as follows [40]: In addition, possible violations of the Strong Equivalence Principle (SEP) are considered in the dynamical model. The SEP, a cornerstone of GR, states that the inertial and gravitational masses of a body are considered to be equal in any experiment in a gravity field, e.g., planets in the gravity field of the Sun, and that the self-gravitational energy Ω i of the bodies themselves only play a role on the same footing as the other energy forms [41]. To quantify the influence of possible violations of the SEP, the Nordtvedt parameter η is used, which can be expressed as a function of PPN parameters (see Section 8.1 of [11]): where it is again assumed that α 3 , ζ 1 , ζ 2 , ζ 3 , ζ 4 = 0. If the SEP is violated, η has a nonzero value. It was determined by [21,25] that for Mercury the dominant perturbation induced by a SEP violation, is equivalent to a change of the effective location of the Solar System Barycenter (SSB). This means that the the position of the Sun with respect to the SSB has to be redefined: The difference in SSB is at maximum in the order of decimetres for η in the order of 10 −5 (the current upper constraint from experiments, see Table 1), and the only measurable effect is a slight difference in the point-mass acceleration exerted on Mercury by the Sun. Finally, the variation of the solar gravitational parameter µ = Gm is considered. GR predicts that the gravitational constant G is constant across space and time, but other gravitational theories have suggested that G may vary with the evolution of the universe [5,42]. In addition, the solar mass m changes over time due to the radiative output of the Sun and solar wind. The combination of the variation of these two parameters is expressed as: In this study, a linear time variationĠm /Gm is assumed as a first-order approximation to constrain the effect of a changing solar gravitational parameter.
The final equations of motion for acceleration a M acting on Mercury is a sum of the defined accelerations in this section: where a CG,i is the central gravity (point mass) acceleration for body i which is taken for all n relevant bodies, the other terms apply for the Sun only. Equations (9) and (10) affect the calculation of the solar acceleration terms.

Simulated Observations
To perform the estimation of parameters of interest (see Section 2.3), observations of the dynamics of planet Mercury over time are required. During the missions, these observations are realised by spacecraft tracking data, which provide two-way range and Doppler measurements between ground stations on Earth and the spacecraft. In this study, range and Doppler observations are both considered, but they are incorporated into the estimation in a decoupled manner. Range observations are sensitive to longer-period signals in the dynamics, such as those from planetary ephemerides. Doppler data, on the other hand, are best suited to measuring shorter period variations, such as the spacecraft orbit and planetary gravity field coefficients. The characteristic period of a signal for which range data will be better suited than Doppler is on the order of several days and higher [43]. Taking this into account, we perform the following, decoupled analyses:

1.
Simulated multi-arc estimation of the spacecraft orbit around Mercury using only Doppler data, with characteristic signatures in the dynamics on the order of hours; 2.
Mercury ephemeris estimation using only range data, with characteristic signatures on the order of months.
This second step constitutes our simulated relativity experiment. The first step allows us to generate spacecraft orbit uncertainties which we use as input to our measurement error budget for our second step. To generate observations for our simulated ephemeris estimation, we recenter the original range observation between Earth and the spacecraft, denoted d E−S/C , making it a range observation between Earth and Mercury's centre of mass, denoted d E−M .
The error level of d E−M (the simulated observation of Mercury with respect to Earth) is a combination of the direct uncertainty in the range observation of the spacecraft, and the spacecraft orbit uncertainty w.r.t. Mercury, projected along the Earth-spacecraft unit vector u E−S/C [44]: where σ r S/C−M is the time-varying orbit uncertainty of the spacecraft with respect to Mercury, obtained from simulated Doppler data analysis. The errors σ d E−M of the simulated Earth-Mercury range observations are an important input for our numerical experiments, as they in part drive the uncertainty at which parameters can be estimated. For the two-way range observations from Earth to the spacecraft (d E−S/C ), the following inputs were taken. For MESSENGER, the range precision depends on the Sun-Probe-Earth (SPE) angle. The errors increase considerably at lower angles due to interference caused by solar plasma [21]. A two-way noise level of 0.5 to 3.0 m is used, depending linearly on the SPE angle with minimum noise at 180 degrees and maximum noise at 35 degrees. Observations are not simulated when the SPE angle is below 35 degrees, as their errors are too high to be of use in this experiment. BepiColombo range data do not suffer from this solar plasma effect due to its multi-frequency radio links and, therefore, the noise level is constant and data at any SPE angle can be used [27], even at extremely low SPE angles there is no problem, providing the capability to conduct superior solar conjunction experiments with the data of BepiColombo [45]. The target two-way noise level for BepiColombo is 20 to 30 cm [46], first reports from the cruise phase of BepiColombo even give centimetre-level errors [47]. A conservative range noise level of 30 cm was chosen for range simulations in this work.
The distance from the spacecraft to Mercury and its projected error u E−S/C · σ r S/C−M are determined through Doppler-based orbit determination of the spacecraft with respect to Mercury. To incorporate this into the simulation, we simulated Precise Orbit Determination (POD) for the MESSENGER and BepiColombo spacecraft, using tracking data schedules and uncertainties similar to those in [21,48] for MESSENGER and [23,49] for BepiColombo. The purpose of this part of our work was to determine how spacecraft position errors vary along the mission duration, which we require as input to Equation (12).
It was found that the position errors of the spacecraft with respect to Mercury are largely based on two factors: the geometry between the spacecraft and Earth in the solar system (which varies over months) and the true anomaly of the spacecraft in its orbit around Mercury (which varies over minutes). Using simulated POD of both spacecraft from their entire mission, a database of orbit geometries and subsequent orbit determination error levels was generated as input for our Mercury ephemeris simulation experiment. During the generation of observations in the experiment, orbit determination errors σ r S/C−M are calculated through interpolation of this benchmark database, using as independent variables the true anomaly of the spacecraft and the orbit geometry between the Sun, Mercury and Earth.
The results for the obtained orbit determination error levels match with error levels reported for MESSENGER (e.g., [20,21] and error levels expected for BepiColombo [50]. The validation of the complete simulation setup in Section 2.4 indicates that these simulated errors are an accurate representation of the real situation, since past experiments could be reliably reproduced.

Parameter Estimation
The goal of the estimation is to determine the accuracy with which the initial state of Mercury and parameters γ, β, η,Ġm /Gm , J 2 and A J 2 can be constrained using the observations. The estimation is performed using a batch non-linear least-squares algorithm. There are two different numerical integrations which are performed in one run of this simulation experiment:

1.
A true orbit has to be generated. A numerical integration of the orbit of Mercury is performed using as input a true initial state of Mercury and set of parameters. This true orbit is used to calculate the observations using the method described in Section 2.2.

2.
The initial state and parameters are perturbed, and a model orbit is generated through numerical integration. With the modelled orbit, model observations are calculated.
The difference between the model observations and the true observations is what is minimised in the least-squares estimation, after which the model orbit can be generated again and this process repeats iteratively until the parameter estimation converges.
The least-squares algorithm calculates a correction on the perturbed initial state and parameters according to the following equation: where P is the covariance matrix: ∆x lsq 0 is the update on the parameter vector x which includes the initial state of Mercury and the estimated parameters, P 0 is the a priori covariance matrix, H x is the design matrix with partial derivatives of the observations with respect to the parameters and ∆z is the difference between the true observations and modelled observations. W is the observation weight matrix, which represents the measurement error covariance. To fill W, we assume uncorrelated measurement errors. Having obtained the typical error levels in Section 2.2, a random Gaussian error sample is generated and added to the state of the spacecraft, which will result in an error in the range observation. The tracking schedules (frequency of observations) are taken similarly to [48] for MESSENGER and [49] for BepiColombo. For both missions, one range observation is simulated as well for each Mercury flyby that was performed during the cruise phase, as one tracking arc is considered during the orbit determination of a flyby [32,51].

Least-Squares Error Analysis
The square root of the diagonal entries of covariance matrix P (Equation (14)) are the formal uncertainties (σ) of the parameters, which is the main result of the simulation that will be analysed in Sections 2.4 and 3. The formal error is generally an optimistic representation of the true error of the estimation, the difference between the true and estimated parameters, a quantity which is not known in a real experiment, but we can analyse in this simulation experiment. Comparing the formal error to the true error provides an important piece of information as to how valid the assumptions underlying the covariance analysis are, which can be used to assess whether the formal confidence levels are realistic. It is assumed that: i Observation uncertainties are Gaussian and uncorrelated; ii The "reality" (from which the observations are simulated) and the estimation model (which is used during the least-squares estimation) are identical.
The covariance of a set of parameters (from which the formal errors are extracted) defines the probability distribution of the true errors of those parameters. Under these assumptions, the true errors will, therefore, be merely a realisation of the parameter covariance. In our simulations, we know exactly whether these assumptions are true: all physical effects used to simulate observations are known and, therefore, it is known exactly whether the model complies with the above assumptions, and where these assumptions may be broken.
When differences are introduced between the true orbit and the modelled orbit, e.g., when a certain gravitational effect is not taken into account in the modelled orbit, assumption ii will be violated. The true errors on estimated parameters can be expected to be high with respect to the formal errors to systematically compensate for the difference between real orbit and modelled orbit. In this case, the formal error can be too optimistic, which would lead to a false sense of confidence when only having the formal errors to look at, which is the case when using real MESSENGER and BepiColombo tracking data.

A Priori Information
To stabilise the numerical solution, and incorporate results of past gravitational experiments, the estimation is provided with a priori information about the uncertainties on the parameters, through the a priori covariance matrix P 0 . Current state-of-the-art results for our parameters of interest are given in Table 1. The value that we use for each parameter as a priori information is indicated in bold. The choice made is usually the most accurate experiment to date, except that using past results based on MESSENGER data is avoided and the next best result is used instead, as results by MESSENGER would inherently be (re)produced by our models. For PPN parameter γ, the present known accuracy is 2.3 × 10 −5 from the Cassini solar superior conjunction experiment [12]. BepiColombo will carry out similar superior conjunction experiments during its mission [45], of which it is expected that at best an uncertainty of 1.1 × 10 −6 can be attained for γ [24], a factor 20 increase in accuracy with respect to the Cassini experiment. For both the current known value and hypothetical value of the a priori uncertainty, results will be presented in Section 3.
For parameter J 2 , choosing the 'correct' result from a particular previous experiment is not directly possible. The various reported values are often mutually incompatible, as is apparent from Figure 1. We choose a value of 2.25 × 10 −7 , which is about the average of the recent results. A very conservative a priori uncertainty of 1 × 10 −7 is taken, which is large enough that it does not unduly bias the estimation, and an uncertainty of J 2 is determined largely from the data.

Consider Covariance Analysis
There are various physical parameters that are used for calculations in the integration or estimation, which are not known to perfect accuracy. Assuming that they are constant in this experiment could lead to optimistic covariance for the parameters that we do estimate. In this study, all variables used in calculating orbital dynamics were tested in terms of the influence of the uncertainty by running the simulation with different settings. The following parameters were identified to have a significant effect on the experiment:

•
The gravitational parameter of the Sun Gm ; • PPN parameters α 1 and α 2 ; • Solar angular momentum S ; • The gravitational parameters of the 300 most influential asteroids (interesting discussions on the effect of asteroids on this type of experiment are presented in [7,26]).
To incorporate the uncertainty of such parameters into our analyses, we apply a consider covariance analysis [36]. This method propagates the uncertainty of input parameters to the uncertainty in the estimated parameters. The effect of the consider parameters on the orbit of Mercury is assessed by means of partial derivatives. This operation is a post-processing step: it happens after the parameter estimation itself is completed and increases the covariance matrix.
The parameter covariance matrix including consider covariance matrix is expressed as follows: where P is the covariance matrix that is output of the least-squares estimation, H c is the design matrix for the consider parameters and C is the covariance matrix of the consider parameters.
For the first three parameters, the consider uncertainty is taken from Table 1, for the asteroids, the consider knowledge is taken from INPOP19a [51]. We assume a diagonal consider covariance matrix. Taking this approach allows us to incorporate a number of known sources of systematic error into our covariance analysis.

Incorporating the Nordtvedt Constraint
The Nordtvedt constraint (Equation (8)) provides an extra piece of information that can guide the estimation algorithm. In particular, exploiting this relation prevents high correlation between the PPN parameters with each other and other parameters in the estimation [21,24]. To implement this equation, parameter η is not estimated in the leastsquares algorithm, but is calculated through the Nordtvedt constraint. In the results in Sections 2.4 and 3, the formal error of η is calculated using the property that the variance of a linear combination of parameters is [60]: The formal variance of η can be calculated by applying this property to the Nordtvedt constraint: Var(η) = Var(γ) + 16Var(β) + Var(α 1 ) However, the covariance terms with one or both of α 1 and α 2 (last 5 terms) are neglected, as these terms cannot be calculated from the estimation output if α 1 and α 2 are excluded from the parameter estimation. The consequence is that the formal error of the Nordtvedt parameter is slightly overestimated, as the largest neglected covariance term is the one between β and α 1 . This covariance term was analysed and was found to be positive in value for our simulation, which causes a negative term in Equation (17), decreasing the variance of η.

Validation
Before presenting the results of this study, a short description will be given of the validation of the methodology. Our numerical integration error of Mercury's orbit (i.e., the numerical error due to approximation errors due to the integration scheme or (propagated) machine precision errors) is approximately 1 cm after 20 years of integration, which is assumed to be sufficient considering the noise level of BepiColombo observations. Validation has been performed by comparing the simulation to selected publications for MESSEN-GER [21] and BepiColombo [23,24]. For the validation, the same inputs and acceleration models are used as much as possible. For BepiColombo, there are many comparable studies [25][26][27][28][29][30] with comparable results, the two indicated articles were chosen because the inputs and methods were elaborated upon very extensively, allowing for closer validation.
The formal errors of our estimations can be seen in Table 2, as well as those from literature. Our reproduction of these publications presents uncertainties within a factor ∼2 for all parameters when compared to the result from the publication itself, and within 1.5 in most cases. As there are various design choices to be made in the setup of such a simulation, and it is not unusual to see differences of a factor 2 or more between similar experiments in literature, this reproduction is considered successful. This validation procedure shows that, despite the simplifications that had to be made in the simulation of observations, our setup serves as a valid method for simulating a gravitational experiment using Mercury tracking data, which can reliably estimate parameters in various situations. Table 2. Formal uncertainties (1σ) reported by recent gravity experiments using tracking data of MESSENGER in the case of [21], or simulated tracking data of BepiColombo in the case of [23,24]. Below each result from literature, a reproduction of the formal uncertainties (1σ) of the publications is given, using our software and similar inputs, compared for validation purposes.

Results
In this section, the results of the simulations will be presented. The simulations are set up with the method and inputs described in Section 2, using the same settings that lead to the successful validation of the algorithm. In the various simulations described in the following subsections, only two settings are changed:

1.
We simulate our virtual reality with a static or dynamic solar oblateness; 2.
We try to estimate a static or dynamic oblateness in the parameter estimation.
We start out in Section 3.1 with the typical approach: we assume the real solar oblateness is static and it is only attempted to estimate a constant coefficient J 2 . In Section 3.2, we analyse the situation as suggested in the introduction of this paper: the solar oblateness is dynamic, and a mean value J 2 as well as the amplitude of the oscillation is estimated. Finally, in Section 3.3, the situation is analysed that in reality, the solar oblateness is dynamic, however, this effect is neglected and only a static coefficient J 2 is estimated.
In Section 2.3.1, it was explained what the expectations considering the true and formal errors when the true situation (point 1 above) and the modelled situation (point 2 above) are the same or different. In Sections 2.4, 3.1 and 3.2 results are presented of simulations where the virtual reality and modelled situation are the exact same, hence, the two assumptions of Section 2.3.1 hold. In the simulation performed to generate the results in Section 3.3, the virtual reality and modelled situation are different, hence, the assumptions are broken. In that case, a thorough analysis is performed of the true errors.

Estimation with a Static Solar Oblateness
We first present the results of an estimation with settings that are conventional in comparable experiments (e.g., [21,23,24]): J 2 is a constant parameter and the only solar spherical harmonics coefficient taken into account, it is a static parameter, the dynamic variation of the solar oblateness is not considered. Formal uncertainties of estimated parameters, for MESSENGER-only, BepiColombo-only, and the combined data set, are reported in Table 3. Table 3. Formal uncertainties (1σ) of estimated gravity parameters and solar oblateness when using the method described in Section 2, except that only a static J 2 is included in the virtual reality and the estimation and no other spherical harmonics effects. Simulations are performed for various types of inputs, as indicated in the first column.
The long-term data set does, however, have an advantage for estimating the time variable gravitational parameterĠm /Gm , as an improvement of one order of magnitude can be found with respect to the estimation that only uses BepiColombo tracking data. The effect caused by the time variable gravitational parameter is a weakening of gravitational interactions causing Mercury to slowly drift away from the Sun, which is different to the perihelion shift, which only changes the orientation of the elliptical orbit compared to the Sun but does not change it. Moreover, the linear reduction in Mercury's mean motion as a result ofĠm /Gm will manifest itself as a quadratic signature in time on Mercury's position. We observe that using the combined data set helps to decorrelate this effect from other parameters: correlation coefficients ofĠm /Gm with γ, β and η parameters are around 0.1 compared to 0.5-0.6 in the case of only using BepiColombo data.
To show the potential of the suggested BepiColombo superior conjunction experiment (see Section 2.3.2), results are also shown in Table 3 when an improved a priori uncertainty of γ is used corresponding to the expected outcome of such an experiment: σ γ = 1.1 × 10 −6 . The formal uncertainty that can be obtained for γ using the parameter estimation in this study (relying on dynamics only) is 3.8 × 10 −6 , worse than the formal uncertainty that can be obtained with this suggested superior conjunction experiment. The value of this experiment is clear: improvements of a factor two can be obtained for both β and J 2 .

Estimation with a Dynamical Solar Oblateness
Here, we report formal uncertainties of the estimated parameters when the amplitude of the periodic component of J 2 is implemented in our truth model and this amplitude is included as a parameter to be estimated. A nominal amplitude A = 0.563 × 10 −7 was used, a quarter of the value of J 2 , as suggested by [18], and it was attempted to estimate A J 2 alongside the parameters that were estimated in Section 3.1. Perturbations due to J 4 are not considered. The results for the formal uncertainties are reported in Table 4. Table 4. Formal uncertainties (1σ) of estimated gravity parameters and solar oblateness when using the method described in Section 2. For the solar spherical harmonics, the time-variable J 2 is implemented in the virtual reality and the estimation as dictated by the amplitude A J 2 . Simulations are performed for various types of inputs, as indicated in the first column. The resulting formal uncertainty for A J 2 is 3.7 × 10 −11 , 0.017% the value of the zonal coefficient J 2 itself. The simulation and estimation in this section have been tested for a range of other amplitudes, from A J 2 = 0 to unrealistically high values of A J 2 = 1000 × J 2 . The resulting formal errors of all estimated parameters are virtually identical for any value of A J 2 . Even if a time-variable component does not exist, the constancy of J 2 can be constrained to a level of 0.017%, given the hypothesised values J 2 = 2.25 × 10 −7 and A = 0.563 × 10 −7 , and under the assumption of a purely sinusoidal variation with a fixed phase lag. Compared to the estimation of a static J 2 in Section 3.1, the formal uncertainties of PPN parameters γ and β and Nordtvedt parameter η are unaffected.
The matrix of the estimated parameters (excluding the Mercury initial state, which is correlated with itself, but only negligibly so with these parameters) is shown in Figure 3, for a priori σ γ = 1.1 × 10 −6 . The a priori uncertainty cannot be improved upon by estimating γ from its dynamical signature (see Section 3.1). Consequently, in this case the determination of γ is decoupled from Mercury's dynamics in the simulation, and, therefore, does not occur in the correlation matrix. In previous studies, which do not consider a dynamic J 2 , the PPN parameters are mainly correlated with J 2 . Including the amplitude A J 2 in the estimation yields a correlation with the zonal coefficient variation amplitude. Correlation matrix between estimated parameters in the case that the amplitude A J 2 is estimated with a priori σ γ = 1.1 × 10 −6 (corresponding to the Table 4). The correlation matrix is constructed using covariance matrix P (Equation (14)).
The time variable gravitational parameter of the SunĠm /Gm has a formal error double in value in Table 4 with respect to Table 3. The reason is the correlation of 0.92 betweenĠm /Gm and A J 2 , indicating that the separate effects are more difficult to distinguish as they have a similar perturbation on the orbit of Mercury. It is expected that these effects cannot be distinguished due to the coincidental timing of the two space missions to Mercury, see Figure 2. Both spacecraft orbit Mercury during a solar maximum. As there are no data in the solar minimum, the estimation algorithm is not able to deduct from the observations that the correction due to the time-varying J 2 is a sinusoidal effect, but might consider it a linear effect instead.

What If J 2 Is Periodic, but It Is Not Estimated?
With the suggestion that the value of J 2 is dynamic, with periodicity following the solar activity cycle, the question can be raised: what happens if a periodic variability exists, but it is not considered in an estimation? In fact, this is the approach that has been taken thus far, and it is natural to consider what the consequences of this choice have been for tests of GR.
Here, we present simulation results to test this, by setting a dynamic J 2 in the truth model with a static value of 2.25 × 10 7 , while omitting the effect of time-variable J 2 in the estimation model. The simulation is performed for a range of amplitudes A J 2 from 0.1% to 100% the value of J 2 .
The resulting true errors of the estimated parameters are plotted against the used value of A J 2 in Figure 4. In addition, the formal error level is indicated for each separate mission, and the combined missions (which are practically independent of the chosen value of A J 2 ). In all of the plots, two distinct regions can be identified. For lower values of A J 2 , the true error is around the formal error, as would be expected. This indicates that the timevariable perturbation of J 2 has a negligible effect and no problems will be encountered in the estimation if it is omitted in the dynamical modelling. In the case of only using MESSENGER data, this behaviour is in most cases observed for larger values of A J 2 as well, as the change in orbit caused by the time-variability is small enough to go undetected.
When increasing A J 2 , the formal errors stay the same while the true errors rise. When using only the BepiColombo or combined data, a clear linear trend in true error can be seen on the right side of the plots. High true to formal error ratios are a sign of a poor estimation result, as the true value of the parameter is far outside the confidence interval given by the least-squares algorithm.
As an indication of where formal uncertainties are no longer a good representation of the true error, and the influence of the unmodelled A J 2 therefore becomes significant, we look at the point at which the true error is three times higher than the formal uncertainty. This means that the true value of the parameter lies outside of the 99% confidence interval (3σ) given by the least-squares estimation, specifically due to the effect of the periodic J 2 effect, which gets absorbed into other parameters. Table 5 indicates where this point is reached for all the cases of Figure 4. Table 5. Amplitudes (A J 2 taken relative to J 2 ) at which the true to formal error ratios exceed 3 in the linear regions in Figure 4. For parameter β using only MESSENGER data, the true-to-false error ratio never exceeds 3 for the implemented amplitudes and the linear region does not appear.

MESSENGER
BepiColombo Combined

Discussion
In this study, the combined analysis of simulated BepiColombo and MESSENGER data was performed, to assess whether a time-variable J 2 is observable in the long-term tracking data, and whether it will influence tests of GR. The results in Table 3 show that the results for parameters γ, β, η and J 2 do not benefit from the combined data set compared to an estimation which only uses BepiColombo data. However, for the time-variable gravitational parameter of the SunĠm /Gm , an improvement of one order of magnitude was found when using the combined data set. An improvement in this parameter can be used to test gravitational theories that predict a varying value for the Newtonian constant G, such as scalar-tensor theories, and provide deeper insight into possible deviations from GR [5,42]. In addition, the results in Tables 3 and 4 show the relevance of conducting a solar superior conjunction experiment with the BepiColombo spacecraft to determine PPN parameter γ.
It is shown in Section 3.2 that the amplitude of the variation of J 2 can be determined to a formal uncertainty of 3.7 × 10 −11 . Considering that currently no reliable constraint is available for the temporal variation of J 2 , any estimated value for the amplitude with this formal uncertainty will be a very valuable result. Many temporal variations suggested in heliophysics (e.g., recent examples by [8,14,16,18,19]) could be rejected based on this experiment. In the same manner, if the temporal variation is too small to measure or does not exist (as, e.g., suggested by [17]), an upper limit for the temporal variation can be determined with the obtained formal uncertainty. Any answer will allow the field of heliophysics to refine their theoretical models of the solar shape and interior further.
Finally, in Section 3.3, the results of a parameter estimation are shown when a temporal variation is ignored and constant J 2 coefficient is assumed. This mirrors the current situation in data analysis, where we stress that we do not perform any direct estimate of how big the J 2 variability may be. It is seen in Figure 4 that especially the estimation of parameterĠm /Gm is sensitive to not estimating a temporal variation, as relatively high true errors appear for low amplitudes of the periodic J 2 compared with the other estimated parameters, i.e., the linear region is in the plot is present at lower amplitudes. This result can be used to already analyse possible amplitude size of J 2 , when combining two pieces of information:

1.
This study shows that if the amplitude of J 2 is larger than roughly 1 × 10 −6 , there should be true errors in the estimation ofĠm /Gm of 10 −12 or higher; 2.
It is unlikely that the estimate forĠm /Gm from [21] has a true error that is in the order of 10 −12 or higher, caused by ignoring a temporal variation of J 2 . If such a high true error would be present, it would have manifested itself in the residuals and/or estimates of other parameters, such that the result of the MESSENGER experiment would be in disagreement with the other experiments, which are independent of the gravity field of the Sun. From this observation, and our results, it can already be deduced using Figure 4 that the amplitude of a temporal variation of J 2 cannot be higher than roughly 1 × 10 −6 .
As formal uncertainty levels will decrease by one to two orders of magnitude with the introduction of the BepiColombo tracking data, the risk becomes even higher that parameters will be estimated with high true errors compared to their formal uncertainty. Table 5 indicates that for an amplitude as low as 0.04%, the results of the estimation oḟ Gm /Gm will be significantly wrong (true error > 3σ) due to the ignored temporal variation of J 2 . If this situation occurs, the parameter estimation will result in false parameter values, while giving relatively high confidence in the results. The consequence is that, if GR is the true theory of gravity, the result of tests of the theory will be a false negative one. This could lead to confusion and controversy in the field of gravitational physics, similar to the situation in the 1960s when measurements of the solar oblateness seemed to contradict GR as well [8,64]. This study shows how to prevent the unnecessary confusion: take notice of the potential temporal variation of the solar shape and estimate it alongside other gravity parameters.

Conclusions
The aim of this study was to investigate the impact of a temporal variation of the solar gravitational field on experiments of gravitational theory that exploit the relativistic perturbations on the orbit of Mercury. A simulated reality and parameter estimation algorithm were used to simulate the experiment that can be conducted by estimating relevant parameters using the MESSENGER mission and upcoming BepiColombo mission. The simulation of the experiment is thoroughly validated and tested and shows comparable results to similar studies.
Combining the MESSENGER and BepiColombo data sets is advantageous for determining the time variable gravitational parameter of the SunĠm /Gm . In addition, it allows us to estimate the amplitude of a periodic variation of the solar zonal coefficient J 2 with a formal uncertainty of 0.017% the value of the coefficient itself. Considering that currently no reliable estimate is available for this periodic variation, this result will be very valuable for gravitational physics and heliophysics.
Our analysis showed that if a periodic variation of J 2 has an amplitude higher than 5%, constraints onĠm /Gm which were obtained using MESSENGER tracking data would contain high true errors to the point that it would contradict results of independent experiments. As this has not been the case, it is expected that the temporal variation is lower than 5%.
It is also shown in Figure 4 that if a periodic variation exists that is higher than 0.04% of the value of J 2 , but it is ignored during the parameter estimation using BepiColombo data, it will lead to spurious conclusions in data analysis. Specifically, doing so will result in true errors of gravitational physics parameters which are substantially higher than heir formal errors, due to the signature of A J 2 leaking into these parameters. It is, therefore, highly recommended that when performing these types of experiments using the upcoming BepiColombo tracking data, note should be taken of possibility that a temporal variation of the solar gravitational field could significantly affect the outcome of their experiments. The method to avoid this situation is to include the temporal variation in the parameter estimation.
It should be noted that the sinusoidal model for the variation of the J 2 coefficient is only a first-order approximation, and it is recommended to test more realistic and diverse representations of the temporal variation, for instance, such that it better corresponds to the magnetic solar activity cycle (e.g., based on sunspot numbers).