An On-Site InSAR Terrain Imaging Method with Unmanned Aerial Vehicles

An on-site InSAR imaging method carried out with unmanned aerial vehicles (UAVs) is proposed to monitor terrain changes with high spatial resolution, short revisit time, and high flexibility. To survey and explore a specific area of interest in real time, a combination of a least-square phase unwrapping technique and a mean filter for removing speckles is effective in reconstructing the terrain profile. The proposed method is validated by simulations on three scenarios scaled down from the high-resolution digital elevation models of the US geological survey (USGS) 3D elevation program (3DEP) datasets. The efficacy of the proposed method and the efficiency in CPU time are validated by comparing with several state-of-the-art techniques.

The InSAR technique has been used for measuring surface topography and altimetry profile [6], mapping three-dimensional building shape [7], and detecting building edge [8].InSAR and TomoSAR imaging techniques demand precise coregistration between master image and slave images [9,10].In [9], a two-step, scale-invariant feature transform (SIFT) registration method was proposed.In [11], an outlier-detecting total least-squares (OD-TLS) algorithm was proposed to enhance the precision and robustness of 3D point-set registration.In [12], a sinc interpolation method was used to implement subpixel-to-subpixel match.
Faithful reconstruction of a terrain profile relies on accurate acquisition of interferometric phase.Numerous filtering methods on interferometric phase have been developed in the past few decades [13], including transform domain methods [14], nonlocal methods [15], and spatial domain methods [16].The trade-off between noise reduction and preservation of terrain-related signal with transform domain methods is typically adjusted via a threshold [14].
In [17], a 3D space-time nonlocal mean filter (NLMF) was applied to detect terrain changes by extracting nonlocal information from pixels in SAR images acquired in different time windows.In [18], a nonlocal mean filter was applied to a few persistent scattering points in a search area to improve the accuracy of 3D deformation profile.The nonlocal filters performed well in preserving details of complex structures, but were less effective in removing speckle noise [15].
A spatial-domain Gaussian filter was used to reduce high-frequency noise while preserving deformation information [19].It could reduce impulse noise and preserve edges by replacing each pixel with the mean value of its neighboring pixels [20], but the edges might become blurred due to loss of fine details.On the other hand, nonlocal filters preserve intricate details and adapt to local structures by considering pixel patch similarities, with the downside of computational complexity and sensitivity to parameters.
Phase unwrapping (PU) is a critical step to derive a faithful terrain profile from the interferometric phase of the acquired InSAR image, and the results are affected by the number of baselines used in probing the target area [21].A phase unwrapping problem could be formulated as a wrap count classification task to invoke deep learning methods [22], as used in processing optical images [23,24].In [25], a quality-guided algorithm was developed by unwrapping the phases along an optimal path in the interferometric phase image, based on a quality map of all edges in the image.Although the result is insensitive to noise, its performance relies on the quality map and the errors may propagate along the path.
A least-squares (LS) phase unwrapping method was formulated as a global optimization task [26], which may be sensitive to outliers and takes long computational time to process a large image.In [27], a phase unwrapping method was proposed by minimizing the difference between the discrete partial derivative of the wrapped phase function and that of the unwrapped phase function.The unwrapped phases were obtained by solving a Hunt's matrix and a discrete Poisson's equation, accelerated by using FFT, and the result was comparable to other methods.
InSAR imaging tasks have been operated on spaceborne [28], airborne [29], groundbased [30], and UAV-borne platforms [31].Spaceborne platforms are typically used to survey wide areas or large-scale phenomena [4], airborne platforms are more flexible in path planning [32], and ground-based platforms are used to monitor local environment [33].
UAV-borne platforms [34][35][36] are expedient for monitoring local area of contingency and can achieve spatial resolution of 10 cm [37] in P and L bands [31].For example, the Antarctic ice sheet (AIS) is covered with rifts and crevasses off the map, endangering the exploration personnel [38,39].Satellite-borne sensors cannot provide updated images and information for on-site tasks [38,40], but can be complemented with the InSAR images acquired with UAVs.Typical satellite-borne platforms take days to revisit the same area, with a baseline of a few hundred meters, while UAV-borne platforms can revisit the same area immediately after the previous flight, with a baseline of a few meters.
The radar signals can be acquired in two separate flights with single-channel SAR or a single flight with dual-channel SAR [41].Typical position accuracy of UAVs derived from GPS lies between 0.5 and 2 m [42], which can be enhanced to the centimeter level by using differential GPS (DGPS) technique [43] or real-time kinematic GPS [44].The downside of deploying UAVs is the impact of airflow disturbance and platform perturbation [42], which can be mitigated by applying motion compensation and autofocusing techniques [45][46][47].
In this work, an on-site InSAR imaging method is proposed to reconstruct a highresolution local terrain profile with UAV-borne SARs in L-band.A mean filter is used to reduce artifact speckles, and a least-squares phase unwrapping method is used to acquire 2D interferometric phase in almost real time.Three high-quality digital elevation models (DEMs) featuring volcano, glacier, and landslide, are retrieved from the US geological survey (USGS) 3D elevation program (3DEP) [48] to validate the efficacy of the proposed method.The performance is further evaluated by comparing the acquired InSAR images with their counterparts acquired using other state-of-the-art techniques under the effects of noise.
The rest of this paper is organized as follows: the proposed InSAR method is presented in Section 2, the simulation results are discussed in Section 3, and some conclusions are drawn in Section 4.

Proposed InSAR Method
Figure 1 shows the schematic of InSAR operation with two parallel flight paths, where the x, y, and z axes are aligned in the ground-range direction, azimuth direction, and height direction, respectively.A(x, y, z) denotes a point target, and the platform flies at height H above ground, with the side-looking angle θ ℓ to A(x, y, z).The coordinates of radar P 0 (η) along the master track and radar P 1 (η) along the slave track are given by The slant ranges from P 0 (η) and P 1 (η) to A(x, y, z) are R 0 (η) and R 1 (η), respectively, with

Backscattered Signals
Figure 2 shows the flow-chart of the range-Doppler algorithm (RDA) used in this work [49].The signal backscattered from the point target A(x, y, z) and received at P n (η) (n = 0, 1) is demodulated to the baseband as where A 0 is the amplitude, f 0 is the carrier frequency, K r is the chirp rate of the linear frequency modulation (LFM) pulse, τ is the range (fast) time, η is the azimuth (slow) time, and w e (t) = rect(t) is a window function, which is equal to one when |t| ≤ 1/2 and zero otherwise.
By taking the Fourier transform of s rn (τ, η) with respect to τ and η sequentially, we have where A is a constant of integration, W e ( f τ ) = w e ( f τ /K r ), and Flow-chart of range-Doppler algorithm (RDA) [49].

Range Compression
Let us define a range-compression filter H rc ( f τ , f η ), a coupling-compensation filter H cc ( f τ , f η ), and a range cell migration correction (RCMC) filter H rcmc ( f τ , f η ) as where f dc is the Doppler centroid and Then, multiply these three filters with S n ( f τ , f η ) to have where By taking the inverse Fourier transform of S n ( f τ , f η ) in the range, we obtain the rangecompressed signal where sinc(x) = sin(πx)/(πx).

Azimuth Compression
Let us define an azimuth compression filter which is multiplied with S n (τ, f η ) to have By taking the inverse Fourier transform of S n (τ, f η ) in azimuth, we obtain the azimuthcompressed signal which is the SAR image stored in a matrix s

Coregistration
Figure 3 shows the flow-chart of InSAR imaging.In the master image, the τ-axis is sampled at τ a + n r ∆τ, with τ a = 2R 0 /c and −N r /2 ≤ n r ≤ N r /2 − 1.These sampling values of τ are stored in a vector The slant ranges associated with all the range cells in the master image are r0 = c ā/2, and the side-looking angle of the vth range cell is Figure 4a shows that the point target A(x, y, h) appears at A 0 in the master image and A 1 in the slave image.If the platforms fly high enough, the range difference between the two tracks in Figure 4a can be approximated as that in Figure 4b, namely, ∆r . By the law of cosines, r 1 [v] can be represented as  Next, apply both sinc interpolation [12] and subpixel-to-subpixel match to coregister the slave image to the master image.The original slave image s(10) 1 of size N a × N r is interpolated in the range direction by a factor of 16 to obtain a finer slave image s(13) 1 of size N a × 16N r , which is resampled to derive a coregistered slave image S 1c [u, v] of size N a × N r .

Interferometry and Flat-Earth Phase Removal
An interferogram is formed from the master image S 0 [u, v] and the coregistered slave image S 1c [u, v] as where The interferometric phase attributed to the flat-earth reference plane is [50], which is subtracted from the phase of I[u, v] in (17) to obtain where ϕ (1)

Mean Filter
Since the master image and the slave image are not perfectly coregistered, the interferometric phase manifests some random noise, inflicting errors in the subsequent phase unwrapping process.A mean filter is applied before phase unwrapping to reduce such phase noise.
Consider a target area of (2L a + 1) azimuth cells by (2L r + 1) range cells, centered at [N a /2, N r /2].The interferometric phase in the target area is mapped from ϕ (1) [u, v] as Next, apply a searching window of size (2w a + 1) × (2w r + 1) and centered at The interferometric phase after mean filtering is computed as [20] ϕ

Poisson's Equation of Unwrapped Phase
Let us define a wrapping operator as [25] ϕ with possible outcomes of −2π, 0, or 2π.Next, take the mirror reflections of the wrapped phase function to obtain an even periodic function, which is continuous at the junction between two adjacent periods.Let U = 2L a + 1 and V = 2L r + 1, an expanded phase function is defined in terms of φ (4) and its three versions of mirror reflection as where The wrapped phase differences The least-squares solution of ( 25) and ( 26) can be obtained by minimizing the cost function [27,52] with the Hunt's method to have [52] φ which is rearranged into a Poisson's difference equation on a 2U × 2V grid as where

Nonlocal Filter
A nonlocal filter can be applied to either the interferometric phase ϕ (1) [u ′ , v ′ ] in (18) before phase unwrapping or ϕ (5) [u ′ , v ′ ] in (35) after phase unwrapping.The output of the nonlocal filter to ϕ (1) [u ′ , v ′ ] is computed as [15,53] where W se is a search window and W 1 [u, v; u ′ , v ′ ] is a weighting coefficient that is determined by the difference of pixels between two similarity windows centered at [u, v] and [u ′ , v ′ ].The weighting coefficient is large if the pixels in these two similarity windows match closely, and vice versa.The sum of all weighting coefficients over W se is set to one.
In the literature, a nonlocal filter is applied before phase unwrapping to reduce noise, speckle, or other artifacts embedded in the wrapped flattened phase, aiming to acquire a more accurate unwrapped phase.A nonlocal filter applied after phase unwrapping aims to smooth the unwrapped phase, at the risk of inducing artifacts or errors to the latter.The simulation results in this work show that smoother interferometric phase distribution is acquired by applying a nonlocal filter before phase unwrapping than after it.

Quality-Guided Phase Unwrapping
A quality-guided phase unwrapping process is also used in this work for comparison.A quality map is defined over a window W s centered at [u, v] as [25] Z where are the partial derivatives of the wrapped phase in the u and v directions, respectively, and their mean values over the window W s are denoted as After computing the quality map over an image area of interest, the pixel with the highest quality-map value is denoted as [u s , v s ].The phase unwrapping process begins with its four surrounding pixels, [u s ± 1, v s ] and [u s , v s ± 1], followed by the pixels surrounding them.The process is repeated until all the pixels in the image area are exhausted.

Target Height Estimation
By adding the flat-earth phases in the target area, back to the unwrapping phase, ϕ (5) [u ′ , v ′ ], we have Without loss of generality, choose cell [1,1] as the reference cell, with a reference phase f [1].The phase difference between the master image and the slave image is calibrated as Figure 5 shows the geometry for target-height estimation.The difference between |P 0 A| and |P 1 A| is estimated as The side-looking angle from the master track toward the point target A is calculated by using the law of cosines as Finally, the height of point target A is estimated as Figure 5. Geometry for target-height estimation.

Simulations and Discussions
In this section, three scenarios are simulated by using the DEM models extracted from the US Geological Survey (USGS) 3D Elevation Program (3DEP) dataset [48], including Mount St. Helens, Columbia Glacier, and Santa Cruz landslide.Without loss of effectiveness, each DEM model is scaled down by a common factor in all three dimensions to reduce the computational time.Table 1 lists the default InSAR parameters used in the simulations, from which the height of ambiguity is determined as [54]  Aside from the mean filter (MF) and the least-squares phase unwrapping (LSPU) method, the nonlocal filter (NF) and the quality-guided phase unwrapping (QGPU) method are also used for comparison.The effects of noise are studied by comparing the acquired images without noise with their counterparts at SNR = 0 dB, −5 dB, and −10 dB. Figure 6a,b shows the master SAR images without noise and at SNR = −10 dB, respectively.The latter manifests speckles over the whole image.Figure 6c,d shows the interferometric phase without noise and at SNR = −10 dB, respectively.The latter is severely smeared by noise and covered with speckles.Figure 6e,f shows the wrapped flattened phase without noise and at SNR = −10 dB, respectively.Similar features as in the interferograms are observed.
Figure 6g,h shows the coherence maps without noise and at SNR = −10 dB, respectively.The coherence between the master SAR image S 0 [u, v] and the coregistered slave image S 1c [u, v] is defined as [54] which is equal to one if the coregistration is perfect.It is observed that the coherence map without noise is close to one, and that, at SNR = −10 dB, it is slightly reduced to about 0.8. Figure 7 shows the reconstructed images of Mount St. Helens with the proposed method and the effects of noise.The comparison between mean filter (MF) and nonlocal filter (NF), as well as between least-squares phase unwrapping (LSPU) and quality-guided phase unwrapping (QGPU) methods, under noise free condition are also demonstrated.Figure 7a shows the true DEM of Mount St. Helens extracted from the dataset, Figure 7b shows the tenfold scale-down model of that in Figure 7a, and Figure 7c shows the reconstructed DEM with the proposed method.
The fidelity of the acquired InSAR image a against the true image b is evaluated with a structural similarity (SSIM) index defined as [55,56] where µ p and σ p are the mean and standard deviation, respectively, of image p, with p = a, b; σ ab is the covariance between images a and b; and c 1 and c 2 are stability constants.The SSIM index lies in [0, 1], with higher index indicating higher similarity.Each image pixel is stored in 8 bits, implying the dynamic range of L = 2 8 − 1 = 255.The stability constants are chosen as c 1 = (0.01L) 2 = 6.50 and c 2 = (0.03L) 2 = 58.52.The SSIM index between the images in Figure 7b,c is 0.90.The fidelity of the acquired InSAR image a against the true image b is also evaluated with a root-mean-square error (RMSE) defined as [57] RMSE(a, b) = where a p and b p are the values of the pth pixels in images a and b, respectively, and P is the number of pixels in one image.The RMSE between the images in Figure 7b,c is 5.79 m. Figure 7d shows the reconstructed DEM, with the NF replacing the mean filter; its SSIM index and RMSE against the image in Figure 7b are 0.89 and 6.14 m, respectively, i.e., slightly worse than the proposed method.
A closer inspection of the images in Figure 7c,d reveals that the NF preserves sharper edge while the MF smears image features.The SSIM indices and RMSE values of these two images are similar, implying that MF and NF have comparable performance.
Figure 7e shows the reconstructed DEM with MF and QGPU; its SSIM index and RMSE against the image in Figure 7b are 0.90 and 5.79 m, respectively, which are identical to those in Figure 7c, indicating that LSPU and QGPU methods have comparable performance in this case.Note that the QGPU method has longer computation time than the LSPU method.
Table 2 lists the CPU time of running for LSPU, QGPU, mean filter, and nonlocal filter, with MATLAB R2019a on a PC with i7-3.00GHz CPU and 32 GB memory.The CPU time of running for the mean filter is about half that of the nonlocal filter.The CPU time of the LSPU is much shorter than that of the QGPU because the former is implemented with FFT on the whole image, while the QGPU is executed pixel by pixel.The breakdown of CPU time in LSPU, QGPU, mean filter, and nonlocal filter, as well as their algorithms, are detailed in Appendix A.  Figure 8 shows the differences between the reconstructed DEMs in Figure 7c,f,g,h and the true DEM in Figure 7b.The difference is calculated as ∆z = |a p − b p |, where a p and b p are the values of the pth pixel in images a and b, respectively.Figure 8 shows that the difference is negligible at SNR≤ −5 dB and becomes significant at SNR= −10 dB.

Columbia Glacier
Figure 10 shows the images of the Columbia Glacier, located at (61.14 • N, 147.08 • W) on the south coast of Alaska, USA.The DEM is extracted from the USGS 3DEP dataset [48], with spatial resolution of 5 m × 5 m. Figure 10a shows the true DEM of the Columbia Glacier extracted from the dataset.Figure 10b shows the fivefold scale-down model of that in Figure 10a.Figure 10c shows the reconstructed DEM with the proposed method and the simulation parameters listed in Table 1.The reconstructed DEM closely matches the true DEM; its SSIM index and RMSE against the image in Figure 10b    Figure 10d shows the reconstructed DEM with NF replacing the mean filter; its SSIM index and RMSE against the image in Figure 10b are 0.87 and 28.24 m, respectively.Figure 10e shows the reconstructed DEM with QGPU replacing LSPU; its SSIM index and RMSE against the image in Figure 10b are 0.88 and 24.91 m, respectively, slightly better than their counterparts in Figure 10c.The glacier in this scenario manifests a steeper slope than that of the volcano in the previous scenario.The use of mean filter may blur some fine features in the DEM; hence, it should be used with caution if the terrain profile changes drastically.Figure 11 shows the difference between the reconstructed DEM in Figure 10c,f-h, and the true DEM in Figure 10b.As SNR is decreased from 0 dB to −10 dB, more pixels in the layover regions manifest significant difference.

Santa Cruz Landslide
Figure 12 shows the images of an area with potential landslide hazards near Santa Cruz (37.03 • N, 122.12 • W), California, USA, on 17 March 2020, which are extracted from the USGS 3DEP dataset [48], with spatial resolution of 3 m × 3 m. Figure 12a shows the true DEM of the target area, and Figure 12b shows the tenfold scale-down model of that in Figure 12a.Figure 12c shows the reconstructed InSAR image with the proposed method.The reconstructed DEM closely matches the true DEM; its SSIM index and RMSE against the image in Figure 12b are 0.90 and 2.32 m, respectively.Figure 12d shows the reconstructed InSAR image, with the nonlocal filter replacing the mean filter.Its SSIM index and RMSE against the image in Figure 12b are 0.89 and 2.46 m, respectively.Figure 12e shows the reconstructed DEM, with QGPU replacing LSPU.Its SSIM index and RMSE against the DEM in Figure 12b are 0.90 and 2.32 m, respectively, same as those for the proposed method.Figure 13 shows the differences between the reconstructed DEM in Figure 12c,f-h and the true DEM in Figure 12b.As SNR is decreased, more pixels manifest significant difference.with different combinations of the filter and phase unwrapping methods under noise-free condition.The best indices among the three different methods are marked by boldface, and the differences among these combinations are not significant.and 12, by using the proposed method under different SNRs.In general, the best indices occur at SNR = 0 dB, but some indices at SNR = −5 dB turn out to be slightly better.Next, we reconstruct two DEMs over the same area, dated 17 March 2020 and 10 August 2022, and show their height difference in Figure 14 to detect possible landslide hazards.Figure 14a shows the height difference between the two true DEMs extracted from the dataset on the two dates just mentioned [48].Figure 14b shows the height difference between the two InSAR images reconstructed with the proposed method, and its SSIM index against the image in Figure 14a is 0.29.Both images show similar patterns, but some fine features in Figure 14a are smeared out in Figure 14b.
Figure 14c shows the height difference between the two images reconstructed with the nonlocal filter replacing the mean filter.The image shows a similar pattern as in Figure 14a, with more fragmented features than the latter.The SSIM index between these two images is 0.30.
Figure 14d shows the reconstructed image, with the QGPU replacing the LSPU.It is more resemblant of Figure 14b than Figure 14c, and its SSIM index against the image in Figure 14a is 0.29.By comparing Figure 14a-d, the combination of the NF and LSPU methods seems to manifest more terrain details in the true DEM.
Figure 14e-g shows the height differences acquired with the NF and LSPU at SNR = 0 dB, −5 dB, and −10 dB, respectively.Their SSIM indices against Figure 14a   Figure 15 shows the density maps of high-risk landslide areas acquired with the three methods compared in this section.The areas with height difference greater than ±1 m are highlighted with red marks (z ≥ 1 m) and blue marks (z ≤ −1 m).
The density maps in Figure 15b,d appear similar, consistent with the performance indices of these two methods.On the other hand, Figure 15c manifests an excessive number of high-risk marks.

Comparison with State-of-the-Art Techniques
In [58], a satellite-based InSAR method utilizing a Kalman filter (KF) and sequential least squares (SLS) was introduced to implement near-real-time applications.The SLS was designed to reduce the CPU time of conventional LS methods by sequentially processing the whole image.For comparison, the results in Figures 7, 10, 12, 14 and 15 demonstrate the efficacy of the LSPU method, which incorporates 2D FFT in the LS method to reduce the CPU time even more significantly.
In [59], a deep learning-based LSPU method utilizing encoder-decoder architecture (PGENet) was proposed to reconstruct the wrapped phase data embedding noise.Similarly, a deep learning-based QGPU via global attention U-Net was introduced in [60].The efficacy of LSPU and QGPU can be enhanced by utilizing a deep learning approach.Furthermore, the results in [59] demonstrated that LSPU outperformed QGPU, producing lower RMSE and shorter computational time, especially in low-coherence areas.The results in Figures 7c,e and 12c,e show that the LSPU has nearly the same performance as the QGPU, not to mention that the LSPU has high computational efficiency, as listed in Table 2.
In [61], a weighted least-squares (WLS) technique was proposed to improve the effectiveness of phase unwrapping within a small baseline InSAR framework.Choosing a small baseline in a satellite-based InSAR approach can reduce the computational cost.The proposed UAV-based InSAR approach has relatively smaller (temporal and spatial) baseline compared to the satellite-based counterpart in [61].In addition, the UAV-based platform offers more flexibility in achieving specific baseline and revisit time.
In [15], the low-coherence area and high-coherence area were filtered by a local fringe frequency compensation nonlocal filter and Goldstein filter, respectively.The Goldstein filter, considered an old-fashioned method, was used for its computational efficiency [15].For the same reason, the mean filter adopted in our work is suitable for relatively smooth and high-coherence areas.In our approach, the data can be acquired with two UAVs (sensors) in a single flight or with one UAV (sensor) in two separate flights that are staggered by a short revisit time.The coherence in the UAV-based InSAR image pair is higher than that in the satellite-based counterpart, which has typical revisit time of 12 days or longer.
In [53], various filters were simulated upon ramp and square noisy images.The results indicated that the nonlocal filter outperformed both the Lee filter and the Goldstein filter (considered old-fashioned filters) on square noisy images, but underperformed the latter on ramp noisy images [53].Such outcomes are consistent with the simulation findings presented in Sections 3.1-3.3.The scenarios simulated in Sections 3.1 and 3.3 manifest relatively smooth height profiles, resembling ramp noisy images.Figures 7c,d and 12c,d show that the mean filter achieves lower RMSE and higher SSIM value in these two scenarios.On the other hand, the scenario simulated in Section 3.2 manifests a steep mountain terrain, resembling square noisy images.Figure 10c,d show that nonlocal filter achieves lower RMSE in this scenario.
In the presence of additive Gaussian noise, the pivoting mean filter emerges as statistically optimal from the perspective of maximum likelihood estimation [20].As for the scenarios with relatively smooth profile discussed in Sections 3.1 and 3.3, reconstruction with mean filter (MF) results in slightly higher SSIM value and lower RMSE value compared with the nonlocal filter (NF).However, the mean filter may oversmooth the phase details in areas with drastic topographical variations.As discussed in Section 3.2, the scenario containing some steep areas may not be well reconstructed by using the mean filter, and the nonlocal filter achieves a lower RMSE on the reconstructed DEM.
In [62], a coherence-guided InSAR phase unwrapping method was proposed in conjunction with cycle-consistent adversarial networks.The coherence-guided phase unwrapping method typically employs a cost function in terms of phase gradients and coherence values to penalize phase discontinuities in low-coherence regions and promote smooth phase paths in high-coherence areas.The method could achieve accurate phase unwrapping with low RMS value.However, the generative adversarial networks entail high computational cost and require extensive training data.
In [63], a median filter was cascaded with a mean filter based on stationary wavelet transform for phase filtering.The median filter exceled in preserving phase fringes, while the mean filter demonstrated superior noise reduction capabilities.
Lightweight UAVs are typically more susceptible to wind disturbances than airborne platforms in conducting SAR or InSAR imaging tasks.Both types of platform may tilt or dip under headwinds and deviate from planned flight path under crosswinds [64].Take a real-world example of dispatching a small UAV for InSAR imaging.It can carry a payload up to 7.5 kg and stay in the air for an hour while equipped with GPS navigation gear.Its attitude response to the wind interference can be ignored if the wind speed if less than 5 mph, and its trajectory deviation can be compensated with servo mechanisms and algorithms.

Discussions on Contributions and Constraints
The contributions of this work are summarized as follows: 1.
An on-site InSAR imaging method is proposed for monitoring environmental changes.The imaging task is carried out with UAVs, which can be swiftly deployed on site with small decorrelation between master and slave images; 2.
High-resolution DEMs are reconstructed and enhanced with a mean filter to mitigate artifacts on InSAR images, which are attributed to imperfect coregistration between master and slave images.A least-squares phase unwrapping method at extremely low computational cost is applied to run the imaging task near real-time; 3.
Three scenarios of DEM reconstruction are simulated to validate the efficacy of the proposed approach, considering the effect of noise.The fidelity of acquired InSAR images is evaluated in terms of SSIM index and RMSE.The merits of using mean filter and least-squares phase unwrapping method are compared with two popular counterparts.
We propose a feasible scheme of deploying UAVs for on-site InSAR imaging of small areas, which cannot be achieved with satellite-borne InSAR platforms.Potential applications include monitoring natural disasters such as landslides, wildfires, and volcanic eruptions.In these scenarios, the satellite-borne InSAR imaging technique is limited by the long revisit time of days, which is impractical for real-time monitoring of disaster evolution.Among many state-of-the-art algorithms, choosing the mean filter and the leastsquare phase unwrapping method via Poisson's difference equation and FFT can practically accomplish real-time imaging tasks in terms of robustness and computational efficiency.
The attitude of an airborne platform can be disturbed by complicated airflow disturbance and platform mechanical oscillation.Their effects on SAR imaging have been compensated with a compressive-sensing technique [46].

Conclusions
An on-site UAV-borne InSAR imaging method is proposed to reconstruct terrain profile with high spatial resolution in real time.A UAV-borne imaging system can be swiftly deployed to monitor rapidly changing environments during extreme weather events or natural disasters.Three different high-resolution DEMs are extracted from the USGS 3DEP datasets to validate the efficacy of the proposed approach.The combination of the least-squares phase unwrapping method featuring short CPU time and the mean filter for mitigating speckles on the acquired InSAR image is effective for monitoring terrain profile in real time.Several state-of-the-art techniques like nonlocal filter and quality-guided phase unwrapping method have been used to validate the advantages of the proposed approach to acquire InSAR images for reconstructing terrain profile in real time.
size U × V.The average CPU time to run a complete cycle of mean filter on one pixel is about 0.005 s, and its counterpart with nonlocal filter is about 0.013 s.

Figure 4 .
(a) Point target A(x, y, h) appears at A 0 in the master image and A 1 in the slave image.(b) Point target A g (x, y, 0) with known r 0 [v], r 1 [v], θ ℓ [v], and b.

3. 1 .Figure 6 .
Figure 6 shows the intermediate images of Mount St. Helens, scaled down tenfold to reduce the computational time.Mount St. Helens is an active volcano located at (46.2 • N, 122.18 • W), Skamania County, Washington, USA.Its elevation is 2549 m and its prominence is 1404 m.The DEM is extracted from the USGS 3DEP dataset [48], with spatial resolution of 1 m × 1 m.

Figure
Figure 7f-h shows the InSAR images acquired with the proposed method at SNR = 0 dB, −5 dB, and −10 dB, respectively.Their SSIM indices against Figure 7b are 0.89, 0.89, and 0.74, respectively, and their RMSE values against Figure 7b are 10.8 m, 9.22 m, and 22.38 m, respectively.The main features in the image are almost unaffected at SNR = −5 dB and become slightly distorted at SNR = −10 dB.In short, the DEM of Mount St. Helens is reconstructed with high fidelity by visual inspection, as well as in terms of SSIM and RMSE, even at SNR = −10 dB.Figure8shows the differences between the reconstructed DEMs in Figure7c,f,g,h and the true DEM in Figure7b.The difference is calculated as ∆z = |a p − b p |, where a p and b p are the values of the pth pixel in images a and b, respectively.Figure8shows that the difference is negligible at SNR≤ −5 dB and becomes significant at SNR= −10 dB.

Figure 9 Figure 9 .
Figure9shows the reconstructed images of Mount St. Helens, with the nonlocal filter (NF) applied before and after the LSPU, under noise-free condition.The computational Figure10shows the images of the Columbia Glacier, located at (61.14 • N, 147.08 • W) on the south coast of Alaska, USA.The DEM is extracted from the USGS 3DEP dataset[48], with spatial resolution of 5 m × 5 m.Figure10ashows the true DEM of the Columbia Glacier extracted from the dataset.Figure10bshows the fivefold scale-down model of that in Figure10a.Figure10cshows the reconstructed DEM with the proposed method and the simulation parameters listed in Table1.The reconstructed DEM closely matches the true DEM; its SSIM index and RMSE against the image in Figure10bare 0.88 and 28.4 m, respectively.The backscattered signals from multiple resolution cells near the steep mountain slope region surrounding the glacier, enclosed by white dashed curves in Figure10b, are mapped to the same resolution cell in the acquired image, inflicting layover effect.The high RMSE value is attributed to such layover regions, which is confirmed later in Figure11.
Figure10dshows the reconstructed DEM with NF replacing the mean filter; its SSIM index and RMSE against the image in Figure10bare 0.87 and 28.24 m, respectively.Figure10eshows the reconstructed DEM with QGPU replacing LSPU; its SSIM index and RMSE against the image in Figure10bare 0.88 and 24.91 m, respectively, slightly better than their counterparts in Figure10c.The glacier in this scenario manifests a steeper slope than that of the volcano in the previous scenario.The use of mean filter may blur some fine features in the DEM; hence, it should be used with caution if the terrain profile changes drastically.Figure10f-h shows the InSAR images acquired with the proposed method at SNR = 0 dB, −5 dB, and −10 dB, respectively.Their SSIM indices against Figure10bare 0.87, 0.86, and 0.78, respectively, and their RMSE values against Figure 10b are 31.93m, 30.24 m, and 33.24 m, respectively.The acquired InSAR images at SNR = 0 dB and SNR = −5 dB have similar SSIM indices, and the RMSE at SNR = −5 dB is slightly lower than the other two images.Figure11shows the difference between the reconstructed DEM in Figure10c,f-h, and the true DEM in Figure10b.As SNR is decreased from 0 dB to −10 dB, more pixels in the layover regions manifest significant difference.
Figure12cshows the reconstructed InSAR image with the proposed method.The reconstructed DEM closely matches the true DEM; its SSIM index and RMSE against the image in Figure12bare 0.90 and 2.32 m, respectively.Figure12dshows the reconstructed InSAR image, with the nonlocal filter replacing the mean filter.Its SSIM index and RMSE against the image in Figure12bare 0.89 and 2.46 m, respectively.Figure12eshows the reconstructed DEM, with QGPU replacing LSPU.Its SSIM index and RMSE against the DEM in Figure12bare 0.90 and 2.32 m, respectively, same as those for the proposed method.Figure 12f-h show the InSAR images acquired with the proposed method at SNR = 0 dB, −5 dB, and −10 dB, respectively.Their SSIM indices against Figure 12b are 0.90, 0.72, and 0.66, respectively, and their RMSE values against Figure 12b are 2.32 m, 7.74 m, and 9.09 m, respectively.Figure13shows the differences between the reconstructed DEM in Figure12c,f-h and the true DEM in Figure12b.As SNR is decreased, more pixels manifest significant difference.

Figure 13 .
Difference between reconstructed DEM and true DEM in Figure 12b: (a) noise-free, (b) SNR = 0 dB, (c) SNR = −5 dB, (d) SNR = −10 dB. , are 0.45, 0.20, and 0.13, respectively, and their RMSE values against Figure 14a are 4.31 m, 4.63 m, and 9.54 m, respectively.The images in Figure 14e,f still retain some useful information about terrain profile change, but that in Figure 14g provides no useful clue.

Table 1 .
Default parameters for InSAR simulations.

Table 2 .
CPU time of running for LSPU, QGPU, mean filter, and nonlocal filter.

Table 3 .
RMSE and SSIM indices of acquired images under noise-free condition.

Table 4
summarizes the RMSE and SSIM indices of images in Figures 7, 10

Table 4 .
RMSE and SSIM indices of acquired images with proposed method under different SNRs.
do Define the searching window, W 1 [u, v; u ′ , v ′ ], centered at [u ′ , v ′ ] if The searching window exceeds the border of target area then Cut off the exceeding part end else Apply the searching window with size of L se × L se end for 1 ≤ u ≤ L se do for 1 ≤ v ≤ L se do if The searching window exceeds the border of target area then Cut off the exceeding part end else Apply the similarity window with size ofL si × L si end Calculate W 1 [u, v; u ′ , v ′ ] end end Find the maximum of W 1 [u, v; u ′ , v ′ ] ∈ W se Calculate ϕ