A Comparative Study on Evolutionary Multi-objective Optimization Algorithms Estimating Surface Duct

The problem of atmospheric duct inversion is usually solved as a single objective optimization problem. Based on ground-based Global Positioning System (GPS) phase delay and propagation loss, this paper develops a multi-objective method including the effect of source frequency and receiving antenna height. The diversity and convergence of solution sets are evaluated for seven multi-objective evolutionary algorithms with three performance metrics: Hypervolume (HV), Inverted Generational Distance (IGD), and the averaged Hausdorff distance (Δ2). The inversion results are compared with the simulation results, and the experimental comparison is conducted on three groups of test situations. The results demonstrate that the ranking of algorithm performance varies because of the different methods used to calculate performance metrics. Moreover, when the algorithms show overwhelming performance using performance metrics, the inversion result is not more close to the real value. In the comparison of computational experiments, it was found that, as the retrieved parameter dimension increases, the inversion result becomes more unstable. When the observed data are sufficient, the inversion result seems to be improved.


Introduction
Atmospheric duct is a layer of the atmosphere that causes electromagnetic waves to bend toward the earth's surface. The inversion of atmospheric duct structure affects the normal operation of electromagnetic weapon and communication systems. Therefore, various methods of sounding refractivity profiles have been put forward [1][2][3][4][5]. Based on the relations between radar echo and refractivity, the technique of refractivity from clutter (RFC) was developed. The technique changes the method of detecting atmospheric duct. Subsequent studies by Gerstoft [6,7], Yardim [8], Sheng [9], and other scholars [10][11][12] have been mainly concerned with improving the inversion algorithm. When Hitney [13] demonstrated the capability of estimating the trapping layers base height from ultrahigh frequency (UHF) signal strengths, the method of remote sensing refractivity profiles from Global Positioning System (GPS) signals was developed. Anderson [14] tried to infer atmospheric duct with phase delay, but there was a certain gap between the inversion results and the real refractivity profile. Liao et al. presented the inversion method of low-elevation ground-based GPS, combining phase delay and propagation loss information to retrieve surface duct parameters, and recently proposed the non-dominated sorting genetic algorithm II combined with simulated annealing algorithm (NSSAGA) minimized simultaneously. Accordingly, the Pareto front needs to be introduced to obtain the optimal decision variables. Suppose The non-dominated set of the entire feasible decision is called the Pareto optimal solution set (PS), and the boundary defined by the set of all point mapped from the Pareto optimal set is called the Pareto front (PF).

Evaluation Metrics
Evolutionary algorithms perform well in finding multi-objective solutions and can ultimately obtain a non-dominated solution set. Therefore, how to evaluate the performance of these algorithms is also extremely important. There are two ways to judge the quality of non-dominated solution sets. First, after the decision space is projected into the target space, it is dependent on the distance between the retrieved PF and the actual PF surface. The smaller the distance, the better the convergence. Second, it depends on the distribution of the PF surface. The more uniform the distribution, the better the PF. In this paper, three metrics are used to evaluate the performance of five kinds of multi-objective algorithms on surface duct inversion.

Hypervolume
Zitzler and Thiele proposed HV in 1999, and it has been widely used since 2003 [26,27]. HV is used to calculate the hypervolume of a non-dominated set and space reference point, which can be simply constructed as the solution set corresponding to the worst target vector. The expression is as follows: Here, Q is the solution set of the non-dominant solution front, and |Q| is the number of non-dominant solution elements. For each non-dominated solution i ∈ Q, a hypercube v i is composed of reference points and members i. Larger metric values indicate that the Pareto solution set obtained can be more widely distributed. Therefore, the bigger the value, the better the algorithm performance.

Inverted Generational Distance
IGD is the reversal of Generational Distance (GD). GD was originally proposed to evaluate the distance between a non-dominated solution set and the real PS. It is an index to evaluate the convergence. IGD is the average value of the minimum distance between the uniform point on the PF and the non-dominated solution set. The calculation is as follows [28]: Here, P * = {p 1 , p 2 , . . . , p w } represents the uniform points covering the optimal PF surface, and d(i, NDS) is the minimum Euclidean distance from the point i to the individual in the non-dominated solution set (NDS). A is the approximate Pareto solution set obtained by the algorithm. f max m and f min m are the maximum and minimum respectively of the m-th objective function values of the real solution set P * . |p * | and |A| are the number of the real solution set elements and the approximate Pareto solution set elements, respectively. Since P * can represent the real PF of the whole problem, the convergence and diversity of the PF obtained is better when the IGD value is lower.
Considering that the IGD value is related to the real PF, we set different real PFs for different problems (seeing Figure 1). The solution space can be divided into two, four, six and eight dimensions according to the height of receiving antenna and the frequency of transmitting source during surface duct inversion, marking the different problems as GPS1, GPS2, GPS3, GPS4 and GPS5. When the solution space is six dimensions, it can be divided into two types of problem, GPS3 and GPS4. The solution of the GPS3 condition is to adopt two sets of data from the same height and different frequencies, and a set of data from other heights. The solution of the GPS4 condition is to consider three sets of data from different heights and frequencies. The contribution of those real PFs to the HV is very small, and IGD is less sensitive to the PF simplification [29]. Therefore, a reference point set is constructed as the real PF by selecting non-dominated solutions from all the obtained solutions by 30 runs ofeach algorithm. elements and the approximate Pareto solution set elements, respectively. Since * can represent the real PF of the whole problem, the convergence and diversity of the PF obtained is better when the IGD value is lower. Considering that the IGD value is related to the real PF, we set different real PFs for different problems (seeing Figure 1). The solution space can be divided into two, four, six and eight dimensions according to the height of receiving antenna and the frequency of transmitting source during surface duct inversion, marking the different problems as GPS1, GPS2, GPS3, GPS4 and GPS5. When the solution space is six dimensions, it can be divided into two types of problem, GPS3 and GPS4. The solution of the GPS3 condition is to adopt two sets of data from the same height and different frequencies, and a set of data from other heights. The solution of the GPS4 condition is to consider three sets of data from different heights and frequencies. The contribution of those real PFs to the HV is very small, and IGD is less sensitive to the PF simplification [29]. Therefore, a reference point set is constructed as the real PF by selecting non-dominated solutions from all the obtained solutions by 30 runs ofeach algorithm.

The Averaged Hausdorff Distance
An 'averaged Hausdorff distance', , was first proposed as one possible remedy for the Hausdorff distance [30,31]. The indicator averages the distances between entries of outcome set and Pareto front approximation. Hence, it is more suitable for multiobjective evolutionary algorithms because of single (or few) outliers generated.
is composed of the modified

The Averaged Hausdorff Distance
An 'averaged Hausdorff distance', ∆ s , was first proposed as one possible remedy for the Hausdorff distance [30,31]. The indicator averages the distances between entries of outcome set and Pareto front approximation. Hence, it is more suitable for multiobjective evolutionary algorithms because of single (or few) outliers generated. ∆ s is composed of the modified indicators GD and IGD. Let Q = {y 1 , . . . , y n }, p * = {p 1 , . . . , p w } ⊂R k be non-empty and finite, then Due to its composition of modified GD and IGD, ∆ s has stronger metric properties than IGD [30]. Refering to [30], it is concluded that the smaller s, the easier single outliers will be punished by ∆ s . Here, we focuse on the ∆ 2 scenario, i.e., the Euclidean norm.

Phase-Delay Model and Propagation-Loss
There are two main effects when GPS signals pass through the troposphere. First, the travel speed of the signals becomes slower in the atmosphere. Second, the path of GPS signals bends. Both delays are caused by changes in the atmospheric refractivity, which can be expressed by Smith and Weintraub's equation [32]: Here, the unit of T is K, and the unit of P and e is hPa. Assuming a spherically symmetric atmosphere, Sheng et al. presented the concrete calculation process of phase delay [33]. The excess phase path is defined as: The phase path S is determined as: In (9), x = rn(r) denotes the refractive radius. The ray path in a vacuum S 0 can be determined as: To model the GPS signal propagation, the single-way propagation loss L(x, z) of ground-based GPS signals is used to describe the objective function [33]: In (11), f represents the propagation factor and C is a constant parameter. In a rectangular coordinate system, the propagation can be calculated as: If the initial field is provided, the split-step Fourier transform (SSFT) is used to calculate low-elevation GPS signal propagation in a tropospheric duct. This can be substituted by a sine or cosine transform because the bottom boundary approximates to a perfectly conducting surface. The SSFT solution of the propagated wave equation is expressed as [16]: In (13), F and F −1 are the Fourier transform and inverse Fourier transform, respectively. Here, ∆x is the range step, p = 2k sin θ represents the transform variable for which θ is the angle from the horizontal, and u(x, z) is the initial field. Balvedi and Walter [34] describe a more detailed process for solving the parabolic equation.
The modified refractivity M, which considers the Earth's curvature, is related to the radio refractivity N as follows [16]: Differentiating with respect to r, we obtain: In the above, r is the altitude in meters and M is the modified refractivity. The unit of dM/dr is M-units/km. When the modified refractivity gradient is less than 0 M-units/km, an atmospheric duct occurs. The atmospheric structure, especially a surface duct, may delay the GPS signal at low elevation. Hence, this paper uses the GPS phase delay and propagation loss to retrieve the surface duct.

Parameterized Model
The atmospheric refractivity profile can be simulated by a parameterized model. The influence of a surface duct on electromagnetic wave propagation is more significant than that of an evaporation duct, and is usually represented by a three-segment linear model [16]: Here, M 0 is the corrected refractivity at sea level, h 1 is the trapping-layer base height, c 1 is the base slope, h 2 is the inversion-layer thickness, and c 2 is the slope from h 1 to h 1 + h 2 . The slope of the top layer is 0.118 M-units/m if we assume that the layer satisfies the standard refractivity condition.

Performance Comparison
To reconstruct the refractivity profile, a multi-objective cost function related to the ordinary least-squares (OLS) is defined [15,16]. The simulations assume a GPS elevation angle of 1 • and a beam width of 16 • . Here, we select seven algorithms for a computational test according to the characteristics of evolutionary algorithms. The specific parameter settings for each algorithm are shown in Table 1.

GrEA
The grid division: div = 45 for 2 objectives, div = 15 for 4 objectives and 6 objectives, and div = 8 for 8 objectives  Table 2 shows the parameter settings for all test problems. In Table 2, GPS1, GPS2, GPS3 and GPS5 represent the problems when the solution space is two, four, six and eight dimensions, respectively, and the upper and lower bounds of the decision space are also given. The decision space includes slope parameters c 1 and c 2 , duct height h 1 and duct thickness h 2 , the transmitting frequency and receiving antenna height. In the sixth column, [1200, 1600] is the range of transmitting frequency, and [0, 200] is the range of antenna height. They are needed in Sections 3.2 and 3.3 above. The first number is transmitting frequency, and the second number is antenna height in parentheses. In the computational test, although the solution space of GPS4 is also six dimensions, the received signal is from a different frequency at the same antenna height. The received signals of GPS3 are from two sets of antennas at the same height, which receive signals from different transmitting frequencies. In the following, we compare the performance of seven algorithms after 30 runs of each algorithm, and present a box diagram and table of the evaluation metrics. We also compare the inversion value with the real value.

Excluding Antenna Height and Transmitting Frequency
Firstly, if we know the antenna height and transmitting frequency, then we only need to retrieve the parameter of refractivity profile. Figure 2 shows box plots representing the distribution of three metrics after 30 runs of each algorithm. A box plot (or box and whisker diagram) is a standardized way of displaying the distribution of data based on the five number summary: minimum, first quartile, median, third quartile, and maximum. In the simplest box plot the central rectangle spans the first quartile to the third quartile (the interquartile range or IQR). A segment inside the rectangle shows the median, and 'whiskers' above and below the box show the locations of the minimum and maximum. Outlying values are marked as '+'.
As Figure 2 shows, the conventional NSGA-II and NSGA-III both perform well on surface duct inversion with HV and IGD as metrics. MOEAD performs poorly on problem GPS1 and problem GPS2. In other problems, if HV is used as the evaluation index, the algorithm performs poorly. However, if IGD is used as the evaluation index, the algorithm performs well. With HV as an indicator, HypE and GrEA perform similarly and both exhibit good performance. With IGD as an indicator, HypE is significantly inferior to GrEA. The reason for the difference in HV and IGD can be found in [25]. If the convergence of the PF has no difficulty, then the performance of the algorithm depends mainly on the solution distribution. However, HV and IGD use different methods to calculate solution distribution diversity. Therefore, if HV and IGD are taken as indicators, there will inevitably be inconsistent results. Meanwhile, it can be noticed that ∆ 2 almost has similary variation characteristics. We will only give the statistical results of ∆ 2 instead of IGD in the followed tables. indicator, HypE is significantly inferior to GrEA. The reason for the difference in HV and IGD can be found in [25]. If the convergence of the PF has no difficulty, then the performance of the algorithm depends mainly on the solution distribution. However, HV and IGD use different methods to calculate solution distribution diversity. Therefore, if HV and IGD are taken as indicators, there will inevitably be inconsistent results. Meanwhile, it can be noticed that almost has similary variation characteristics. We will only give the statistical results of Δ instead of IGD in the followed tables.  Tables 3 and 4 show the mean and variance respectively of the HV and Δ metrics. In these tables, M represents the number of cost functions (solution space), and D represents the number of parameters to be retrieved (decision space). Excluding antenna height and transmitting frequency as inversion parameters is marked as condition 4, and including transmitting frequency is marked as 5. Including both is marked as 6. The gray background in the table indicates that the algorithm performs well in the inversion process, and the boldface indicates that the performance is poor   Tables 3 and 4 show the mean and variance respectively of the HV and ∆ 2 metrics. In these tables, M represents the number of cost functions (solution space), and D represents the number of parameters to be retrieved (decision space). Excluding antenna height and transmitting frequency as inversion parameters is marked as condition 4, and including transmitting frequency is marked as 5. Including both is marked as 6. The gray background in the table indicates that the algorithm performs well in the inversion process, and the boldface indicates that the performance is poor.
It can be seen from Tables 3 and 4 that when the decision space is under the same dimension, HV gradually decreases while ∆ 2 gradually increases as the dimension of the solution space increases. It shows that the poor definition of the inverse problem is enhanced, and that the evolutionary algorithm has difficulties converging to the real solution and obtaining more possible solutions. When we use HV as an indicator, HypE and NSGA-II are significantly superior to other algorithms in GPS inversion problems. MOEAD, by contrast, has an absolute disadvantage, which is roughly in line with the box plots. Some algorithms also perform well in specific problems, such as NSGA-III. Although there is no gray area in 15 problems, HV values are similar to NSGA-II in most cases. However, if ∆ 2 is used as the performance index, the conclusion is different. The performance of NSGA-II are consistent with Table 3, and there are more bad behavior with HypE's performance. This may be due to differences in HV and ∆ 2 diversity assessment mechanisms, as can be inferred from their calculation formulae. Figure 3 shows the spatial distribution of decision space and solution space, where the left plane is the three-dimensional distribution of the first three parameters, and the right plane is the distribution of the first three cost functions. Figure 3b shows the solution distribution of two-objective functions. It can be seen that the distribution of MOEAD is scattered, and that other algorithms approach the real value.
9.4444 × 10 −6 (4.3702 × 10 −7 ) 9.4124 × 10 −6 (4.4887 × 10 −7 ) 9.3182 × 10 −6 (5.2555 × 10 −7 ) 9.5819 × 10 −6 (3.6928 × 10 −7 ) 9.4182 × 10 −6 (3.2268 × 10 −7 ) 9.1581 × 10 −6 (6.3368 × 10 −7 )   Therefore, it is more reasonable to evaluate with IGD. In Figure 3a, the solution sets of the seven algorithms are very scattered, indicating that the diversity and convergence of all multi-objective functions are more similar in the two-dimensional problem. The results are consistent with those in Tables 3 and 4. KnEA, marked as green points, diffuses to the top and back of the PF in Figure 3f,h,j, and is confined in a narrow space. This phenomenon indicates that the performance of KnEA is poor, and that the solution space diversity is monotonous. If we compare the solution space alone, MOEAD and HypE may outperform other algorithms in surface duct inversion problems. It can be seen from Tables 3 and 4 that when the decision space is under the same dimension, HV gradually decreases while gradually increases as the dimension of the solution space increases. It shows that the poor definition of the inverse problem is enhanced, and that the evolutionary algorithm has difficulties converging to the real solution and obtaining more possible solutions. When we use HV as an indicator, HypE and NSGA-II are significantly superior to other algorithms in GPS inversion problems. MOEAD, by contrast, has an absolute disadvantage, which is roughly in line with the box plots. Some algorithms also perform well in specific problems, such as NSGA-III. Although there is no gray area in 15 problems, HV values are similar to NSGA-II in most cases. However, if is used as the performance index, the conclusion is different. The performance of NSGA-II are consistent with Table 3, and there are more bad behavior with HypE's performance. This may be due to differences in HV and Δ diversity assessment mechanisms, as can be inferred from their calculation formulae. Figure 3 shows the spatial distribution of decision space and solution space, where the left plane is the three-dimensional distribution of the first three parameters, and the right plane is the distribution of the first three cost functions. Figure 3b shows the solution distribution of two-objective functions. It can be seen that the distribution of MOEAD is scattered, and that other algorithms approach the real value.
Therefore, it is more reasonable to evaluate with IGD. In Figure 3a, the solution sets of the seven algorithms are very scattered, indicating that the diversity and convergence of all multi-objective functions are more similar in the two-dimensional problem. The results are consistent with those in Table 3 and 4. KnEA, marked as green points, diffuses to the top and back of the PF in Figure 3f, h, and j, and is confined in a narrow space. This phenomenon indicates that the performance of KnEA is poor, and that the solution space diversity is monotonous. If we compare the solution space alone, MOEAD and HypE may outperform other algorithms in surface duct inversion problems.    Figure 4 shows the comparison of simulated and retrieved profiles. The absolute errors between the inversion and simulated atmospheric refractivity profiles are given on the right plane. Although HypE performs well in performance tests, the results still show a gap between retrieved and simulated value, and are not as accurate as other algorithms. This indicates that HypE still has room for improvement. By contrast, MOEAD and NSGA-III can construct atmospheric refractivity profiles with relative accuracy in various conditions, indicating that they are more robust. Although KnEA obtained the worst metrics values and the worst retrieved results in the two-dimensional case, the   Figure 4 shows the comparison of simulated and retrieved profiles. The absolute errors between the inversion and simulated atmospheric refractivity profiles are given on the right plane. Although HypE performs well in performance tests, the results still show a gap between retrieved and simulated value, and are not as accurate as other algorithms. This indicates that HypE still has room for improvement. By contrast, MOEAD and NSGA-III can construct atmospheric refractivity profiles with relative accuracy in various conditions, indicating that they are more robust. Although KnEA obtained the worst metrics values and the worst retrieved results in the two-dimensional case, the inversion results occupy the middle level in other cases. The comparison of inversion results shows that the evaluation metrics can only evaluate the distribution of algorithms in decision space and solution space, but cannot be used to judge the actual inversion result directly. Therefore, it is necessary to select an appropriate algorithm according to the actual requirements in practical problems. By comparing Figure 4b,d,h,j, we can see whether an individual algorithm is improving step by step with an increase in the amount of observed data, such as MOEAD. However, in the GPS4 problem, the inversion results of MOEAD are no better than those of previous results. The comparison of inversion results shows that the evaluation metrics can only evaluate the distribution of algorithms in decision space and solution space, but cannot be used to judge the actual inversion result directly. Therefore, it is necessary to select an appropriate algorithm according to the actual requirements in practical problems. By comparing Figure 4b,d,h, and j, we can see whether an individual algorithm is improving step by step with an increase in the amount of observed data, such as MOEAD. However, in the GPS4 problem, the inversion results of MOEAD are no better than those of previous results.

Including Transmitting Frequency
Given that microwave signals of unknown frequencies are sometimes received, we can try to retrieve the atmospheric duct by collecting these signals. Therefore, transmitting frequency is also used as an inversion parameter in this subsection. Figure 5 shows box plots with five dimensions of decision space. NSGA-II remains the top performer with the HV metrics, followed by NSGA-III. MOEAD is the worst performer, followed by KnEA and GrEA. With IGD or as the evaluation metric, similar conclusions can be obtained. However, MOEAD does not obtain the worst values in problem GPS4 and problem GPS5. KnEA achieves the worst values in problem GPS4, and HypE obtains the worst Δ values in problem GPS5.

Including Transmitting Frequency
Given that microwave signals of unknown frequencies are sometimes received, we can try to retrieve the atmospheric duct by collecting these signals. Therefore, transmitting frequency is also used as an inversion parameter in this subsection. Figure 5 shows box plots with five dimensions of decision space. NSGA-II remains the top performer with the HV metrics, followed by NSGA-III. MOEAD is the worst performer, followed by KnEA and GrEA. With IGD or ∆ 2 as the evaluation metric, similar conclusions can be obtained. However, MOEAD does not obtain the worst ∆ 2 values in problem GPS4 and problem GPS5. KnEA achieves the worst ∆ 2 values in problem GPS4, and HypE obtains the worst ∆ 2 values in problem GPS5.  Figure 6 shows the spatial distribution of decision space and solution space. When the decision space is five-dimensions, the solution set of MOEAD is still scattered, which is similar to the previous result. However, the decision space of seven algorithms becomes stratified in problem GPS1. Except for MOEAD, the inversion parameters of the algorithms all approximate to the real parameters, and the corresponding HV and IGD values also reflect this phenomenon well. In the solution space, KnEA, marked as green points, is more evenly distributed on the PF. Thus the inversion result of KnEA is improved. Comparing the inversion results of Figure 4 with Figure 7, we can reach similar conclusions. In Figure 6, f 2 of the PF is close to 0.011, which is smaller than the previous approximation value in Section 3.1; other parts of the PF have little change. Therefore, it can be inferred that the retrieved parameters may be improved, as shown by the inversion results in Figure 7. Figure 7 shows the comparison of simulated and retrieved profiles. When the solution space is two dimensions or four dimensions, the worst inversion results are obtained from HypE, MOEAD and NSGA-II. As the dimensions increase, NSGA-III starts to exhibit slightly better performance. Compared with Figure 4, the inversion result of GrEA improves when the decision space dimension increases in the same dimension of solution space. Similar rules are found for HypE and KnEA. NSGA-III, on the other hand, performs worse in some problems, especially in low-dimensional solution space. The inversion results of Two_Arch2 are not affected by the decision space dimension.
KnEA, marked as green points, is more evenly distributed on the PF. Thus the inversion result of KnEA is improved. Comparing the inversion results of Figure 4 with Figure 7, we can reach similar conclusions. In Figure 6, f2 of the PF is close to 0.011, which is smaller than the previous approximation value in subsection 3.1; other parts of the PF have little change. Therefore, it can be inferred that the retrieved parameters may be improved, as shown by the inversion results in Figure 7.  Figure 7 shows the comparison of simulated and retrieved profiles. When the solution space is two dimensions or four dimensions, the worst inversion results are obtained from HypE, MOEAD and NSGA-II. As the dimensions increase, NSGA-III starts to exhibit slightly better performance. Compared with Figure 4, the inversion result of GrEA improves when the decision space dimension increases in the same dimension of solution space. Similar rules are found for HypE and KnEA.

Including Antenna Height and Transmitting Frequency
If we obtain a microwave signal with unknown frequency and antenna height, we need to retrieve simultaneously the atmospheric duct parameters, antenna height, and source frequency.

Including Antenna Height and Transmitting Frequency
If we obtain a microwave signal with unknown frequency and antenna height, we need to retrieve simultaneously the atmospheric duct parameters, antenna height, and source frequency. Figure 8 shows box plots of all algorithms in six dimensions of decision space. According to the median values, NSGA-II and NSGA-III achieve the best HV and ∆ 2 values. MOEAD performs poorly in all problems except for GPS5. Compared with the previous values in Sections 3.1 and 3.2, the abnormal values of the metrics have increased significantly, which further verifies the difficulties with the inverse problem. Figure 9 shows the three-dimensional distribution of decision space and solution space. In general, S 1 , S 2 and S 3 are all scattered and untidy in the solution space. In the solution space, f 1 , f 2 and f 3 form a rod-shaped structure slanting on the plane f 3 = 0.011. Some yellow points representing MOEAD are scattered on the back of the PF. The solution space becomes convergent and the diversity of decision space is enhanced. This indicates that the disturbance of the inverse problem is enhanced with the increase in decision space dimension. In other words, it is more difficult to obtain the exact value through these multi-objective algorithms.
To illustrate the similarity of objective functions, we use multidimensional scaling method (MSD) to visualize our results. MDS is a powerful statistical method that maps proximity data on pairs of objects into distances between points in a multidimensional space [35]. The space is usually two-dimensional, sometimes also three-dimensional. Here, it is adopted to visualize results' structure. Figure 10 shows two-dimensional graph of objective functions. We note that Two_Arch2's results always has different location from other algorithms. In the same scenario, the larger the objective functions dimension is, the smaller the distance among firms are. For example, these results of GPS1 (two-dimensional problem) has larger position distance than these of GPS5 (eight-dimensional problem) in Section 3.1. On ther other hand, these algorithms seems have smaller position distance when increasing known conditions to solve the lower dimensional problems. However, it is obvious that the distances among different algorithms are larger in Section 3.3 than in Section 3.1 for GPS5 problem. The reason of results distribution may be that the increase of constraint conditions does not improve non-uniqueness caused by the number of objective functions.  Figure 9 shows the three-dimensional distribution of decision space and solution space. In general, S1, S2 and S3 are all scattered and untidy in the solution space. In the solution space, f1, f2 and f3 form a rod-shaped structure slanting on the plane f3 = 0.011. Some yellow points representing MOEAD are scattered on the back of the PF. The solution space becomes convergent and the diversity of decision space is enhanced. This indicates that the disturbance of the inverse problem is enhanced with the increase in decision space dimension. In other words, it is more difficult to obtain the exact value through these multi-objective algorithms.   Figure 11 shows the comparison of simulated and retrieved profiles in eight-dimensional decision space. Two_Arch2 is relatively stable and the inversion results are not improved. However, the absolute error of atmospheric refractivity obtained by the other algorithms is less than the results in Sections 3.1 and 3.2. This phenomenon is consistent with the solution space distribution. In the solution space distribution, all the cost functions converge on the real PF surface. It can be concluded that although the number of inversion parameters increases, and the difficulties caused by inversion increases, the multi-objective algorithms can still retrieve the real values with their excellent search ability. In problem GPS1, the maximum absolute error obtained by GrEA, MOEAD and NSGA-II hovers at 10 N-units, whereas the inversion results of HypE and Two_Arch2 approach 30 N-units. By comparing NSGA-II and NSGA-III, it can be seen that NSGA-II works well in low-dimensional solution space, especially in two-dimensional space. The maximum absolute error is 12 N-units. NSGA-III always maintains the maximum absolute error around 20 N-units. Comparing the corresponding Figure 11b,d,h and j in the three conditions, the absolute errors of the algorithms, except for GrEA and KnEA, all decrease as the dimension of decision space increases. This indicates that although the dimension of the inversion problem increases, the results obtained can still be improved appropriately. Comparing the results in the three conditions, it can be found that HypE cannot obtain good profiles in all three cases, and may not be applicable when using GPS information to retrieve surface duct.  To illustrate the similarity of objective functions, we use multidimensional scaling method (MSD) to visualize our results. MDS is a powerful statistical method that maps proximity data on pairs of objects into distances between points in a multidimensional space [35]. The space is usually two-dimensional, sometimes also three-dimensional. Here, it is adopted to visualize results' structure. Figure 10 shows two-dimensional graph of objective functions. We note that Two_Arch2's results always has different location from other algorithms. In the same scenario, the larger the objective functions dimension is, the smaller the distance among firms are. For example, these results of GPS1 (two-dimensional problem) has larger position distance than these of GPS5 (eight-dimensional problem) in subsections 3.1. On ther other hand, these algorithms seems have smaller position distance when increasing known conditions to solve the lower dimensional problems. However, it is obvious that the distances among different algorithms are larger in subsections 3.3 than in subsections 3.1 for GPS5 problem. The reason of results distribution may be that the increase of constraint conditions does not improve non-uniqueness caused by the number of objective functions.  Figure 11 shows the comparison of simulated and retrieved profiles in eight-dimensional decision space. Two_Arch2 is relatively stable and the inversion results are not improved. However, the absolute error of atmospheric refractivity obtained by the other algorithms is less than the results in subsections 3.1 and 3.2. This phenomenon is consistent with the solution space distribution. In the solution space distribution, all the cost functions converge on the real PF surface. It can be concluded that although the number of inversion parameters increases, and the difficulties caused by inversion increases, the multi-objective algorithms can still retrieve the real values with their excellent search ability. In problem GPS1, the maximum absolute error obtained by GrEA, MOEAD and NSGA-II

Conclusions
The traditional method of detecting atmospheric duct inversion is RFC technology. Based on previous work [15,16], this paper uses the phase delay and propagation loss of ground-based GPS to retrieve the atmospheric refractivity environment with a multi-objective algorithm. Given that different approaches have different search abilities, we divided these tests into three groups according to whether receiving antenna height and transmitting frequency were included as parameters. The performance of seven evolutionary algorithms was evaluated by HV, IGD and metrics. Based on the HV results, NSGA-II remains the top performer in the three test groups, followed by NSGA-III. MOEAD is the worst performer. However, the behavior of the HV and IGD (or ) metrics may differ in some cases, particularly for MOEAD, HypE and GrEA. When no factors are taken into account, KnEA fails to converge retrieved results to real results, and the diversity of the solution space is monotonous. However, when transmitting frequency is considered, KnEA is significantly improved, and the solution space is evenly distributed on the PF. When two factors are considered simultaneously, the solution space converges, and the diversity of the decision space is enhanced. The results show that when the dimension of the decision space increases, the disturbance Figure 11. The comparison of simulated and retrieved profiles with 7 test algorithms when considering two factors. Subfigure (a) to (j) are the comparison diagram of simulated and retrieved profiles in GPS1, GPS2, GPS3, GPS4, and GPS5, respectively.

Conclusions
The traditional method of detecting atmospheric duct inversion is RFC technology. Based on previous work [15,16], this paper uses the phase delay and propagation loss of ground-based GPS to retrieve the atmospheric refractivity environment with a multi-objective algorithm. Given that different approaches have different search abilities, we divided these tests into three groups according to whether receiving antenna height and transmitting frequency were included as parameters. The performance of seven evolutionary algorithms was evaluated by HV, IGD and ∆ 2 metrics.
Based on the HV results, NSGA-II remains the top performer in the three test groups, followed by NSGA-III. MOEAD is the worst performer. However, the behavior of the HV and IGD (or ∆ 2 ) metrics may differ in some cases, particularly for MOEAD, HypE and GrEA. When no factors are taken into account, KnEA fails to converge retrieved results to real results, and the diversity of the solution space is monotonous. However, when transmitting frequency is considered, KnEA is significantly improved, and the solution space is evenly distributed on the PF. When two factors are considered simultaneously, the solution space converges, and the diversity of the decision space is enhanced. The results show that when the dimension of the decision space increases, the disturbance of the inverse problem is enhanced, and the difficulty of obtaining accurate values through multi-objective algorithms increases.
Comparing the inversion results with the simulation results, it was found that the inversion results are not closer to the real value when compared with other algorithms, even if the algorithm performs well in performance metrics, such as HypE. By contrast, MOEAD and NSGA-III can construct the atmospheric refractivity environment under various conditions, indicating that the two algorithms are more robust. Comparing the three groups of numerical experiments, it was found that when the solution space dimension increases, the multi-objective algorithms can still retrieve the real parameters by virtue of their excellent search ability. However, HypE cannot obtain a good profile in all three cases, suggesting that HypE may not be applicable to such problems.
The multi-objective evolutionary algorithm is used mainly to deal with DTLZ and WFZ problems, and the improved evolutionary algorithm will also be tested on these problems. However, after the computational experiments conducted by Li et al. [25], it was found that these algorithms do not always work well on all test problems. Joint inversion when using ground-based GPS signals is a multi-objective problem. For this problem, future research should aim to improve NSGA-II or NSGA-III to obtain better results. One solution is to search for the knee front instead of the whole PF.