An Improved Stanford Method for Persistent Scatterers Applied to 3D Building Reconstruction and Monitoring

Persistent scatterers interferometric Synthetic Aperture Radar (PS-InSAR) is capable of precise topography measurement up to sub-meter scale and monitoring subtle deformation up to mm/year scale for all the radar image pixels with stable radiometric characteristics. As a representative PS-InSAR method, the Stanford Method for Persistent Scatterers (StaMPS) is widely used due to its high density of PS points for both rural and urban areas. However, when it comes to layover regions, which usually happens in urban areas, the StaMPS is limited locally. Moreover, the measurement points are greatly reduced due to the removal of adjacent PS pixels. In this paper, an improved StaMPS method, called IStaMPS, is proposed. The PS pixels are selected with high density by the improved PS selection strategy. Moreover, the topography information not provided in StaMPS can be accurately measured in IStaMPS. Based on the data acquired by TerraSAR-X/TanDEM-X over the Terminal 3 E (T3 E) site of Beijing capital international airport and the Chaobai River of Beijing Shunyi District, a comparison between StaMPS-retrieved results and IStaMPS-retrieved ones is performed, which demonstrates that the density of PS points detected by IStaMPS is increased by about 1.8 and 1.6 times for these two areas respectively. Through comparisons of local statistical results of topography estimation and mean deformation rate, the improvement granted by the proposed IStaMPS is demonstrated for both urban areas with complex buildings or man-made targets and non-urban areas with natural targets. In terms of the spatio-temporal deformation variation, the northwest region of T3 E experienced an exceptional uplift during the period from Jun. 2012 to Aug. 2015, and the maximum uplift rate is approximately 4.2 mm per year.


Introduction
Space-based technology is an efficient way to reconstruct 3D topography and monitor both natural and man-made disasters, such as earthquakes [1][2][3], volcanoes [4][5][6][7], landslides [8][9][10][11], and anthropogenic subsidence or uplift due to fluid withdrawal or injection [12][13][14][15]. One representative example is the spaceborne Synthetic Aperture Radar (SAR), which has a high spatial-temporal resolution and a broad spatial coverage, capable of precise topography measurement and deformation monitoring using advanced multi-temporal interferometric SAR (InSAR) techniques [16], such as persistent scatterer InSAR (PS-InSAR) [17][18][19][20], SBAS [21,22], SqueeSAR [23], CAESAR [24,25], PD-PSInSAR [26], and TomoSAR [27][28][29]. With its capability of measuring topography up to sub-meter scale and monitoring subtle deformation up to mm/year scale, PS-InSAR has been widely employed where there are multiple types of scattering characteristics, including man-made and natural targets. It is shown that the density of PS points detected by IStaMPS is increased by about 1.8 and 1.6 times for these two areas, respectively. The enhanced performance of IStaMPS was illustrated by each step of selected points using the T3 E site as an example. Through comparisons of local statistical results of topography estimation and mean deformation rate, the improvement granted by the proposed IStaMPS was demonstrated quantitatively. Finally, the spatiotemporal deformation variations of T3 E during the period from June 2012 to August 2015 were investigated.

Methodology
Suppose there are M + 1 single-look complex (SLC) SAR images, and then M interferograms are generated with respect to a common master image. For the stack of interferograms, the key procedures of StaMPS and IStaMPS are shown in Figure 1, which are marked with red dotted box and red solid box, respectively. The main difference of PS identification is the selection of initial PSCs and the removal of unstable adjacent PS pixels. Initially, a subset of pixels is selected based on analysis of the mean amplitude (MA) and the amplitude dispersion index (ADI) of SAR stack. Then, the reason the initial set of PSCs is more suitable than that of StaMPS for the PS probabilistic model, is explained in detail. Moreover, an optional selection is included to reject adjacent pixels with low temporal coherence. The difference is detailed in the following subsections. In addition, to accurately retrieve both topography and deformation, there are several aspects that differ from StaMPS in the procedure of results retrieving. Firstly, the spatial-correlation residual topography phase is kept before 3D phase unwrapping. Secondly, the residual topography and linear deformation are roughly estimated and subtracted. This step ensures accurate estimation of the atmosphere and orbit phase. Finally, to remove the noise part of deformation phase, the deformation time series are obtained by spatially and temporally lowpass filtering.

Selection of PS Pixels Based on ADI and MA
When performing the phase stability analysis based on spatial correlation in StaMPS, the normalizing weighting factors of neighboring pixels are typically set with the SNR associated with brightness. In fact, the brightness of PS pixel is typically less than that of layover pixels. If layover pixels within the superposition of multiple strong scatterers are not excluded, the coherence of the PS near the layover pixels may not be exactly calculated based on spatial correlation. As illustrated in Figure 2, pixel P 1 associated with the resolution cell in red is mainly overlaid by the two strong dominant scatterers of building eave and tree crown. If the pixel P 1 is not excluded, the coherence calculation of PS pixel P 2 would be influenced due to the large weight of pixel P 1 . As a consequence, the coherence of true PS points may decrease and the coherence of non-PS points may increase. . Geometric diagram of two resolution cells: P 1 represents a layover pixel with two strong point-like scatterers' superposition and P 2 represents a PS pixel within only one scatterer from building façade.
When it comes to the PS pixels selection based on spatial correlation, the population of PSCs pixels is treated as the union of two populations, population X containing only PS pixels and population Y containing only non-PS pixels. Then, the probability density model p(γ P ) associated with the coherence γ P of pixel P is a weighted sum of the probability density p X (γ P ) for the PS pixels and the probability density p Y (γ P ) for the non-PS pixels, i.e., where β is the weighting factor with 0 ≤ β ≤ 1. By binning and normalizing the values of γ P , p(γ P ) can be obtained through histogram, while p Y (γ P ) can be derived using a large number (10 6 used here) of simulated pseudo-pixels with random noise phase to model the noise term, and then an adaptive coherence threshold on γ P can be derived to identify the PS pixels. Due to adaptivity of the coherence threshold obtained based on the PS probabilistic model, it can effectively select the PS pixels, especially for the areas with few man-made targets. However, the probabilistic model in Equation (1) may be limited for complex areas, where there exist some special layover pixels within multiple strong dominant scatterers, i.e., the pixel P 1 in Figure 2. Due to the partial correlation of layover pixels, the simulated random noise phase population would not exactly contain this kind of layover population. Then, p Y (γ P ) simulated by random noise phase is not accurate enough to model the population Y of initial PSCs, which contains the special layover pixels. As a result, the adaptive threshold derived based on the probabilistic model in Equation (1) would not be exact in selecting stable PS. Some layover pixels would be misidentified as PS, while many stable PS near the layover pixels would be dropped from the PSCs.
Accordingly, some special layover pixels within multiple strong scatterers must be removed before coherence calculation and PS pixels selection based on spatial correlation. Since there is a statistical relationship between amplitude stability and phase stability [18], the initial pixels for phase stability analysis are selected based on the statistics of their amplitudes. In this work, the non-PS pixels within an area of zero scatterer and PS pixels within an area of a single dominant scatterer are selected by thresholding on ADI and MA. The MA µ A,P and ADI D A,P of P pixel are defined as [18,43] where S m,P is the complex value of the mth observation at pixel P, and | · | is the absolute value operator. They measures the scattering brightness of a pixel and the standard deviation (std) from a series of amplitude values, respectively. Since the layover pixels have a high brightness level [43,44], thresholds on µ A,P and D A,P ensure that most of layover pixels and non-PS pixels are excluded. Then, the initial set of PSCs ensures the correctness of coherence γ P estimated based on spatial correlation. Moreover, the initial PSCs are guaranteed to be suitable for the probabilistic model to further select the reliable PS. Typically, the threshold value on D A,P is predefined in the region of 0.4, and the threshold value on µ A,P is predefined to be above µ A,85% , where µ A,85% means the first 85% of pixels with their mean amplitudes sorted from smallest to largest are selected.

Removal of Unstable Adjacent PS Pixels
In StaMPS, to ensure the accuracy of 3D phase unwrapping, adjacent pixels of PS are entirely dropped from PSCs due to the assumption that the dominant scatterer in PS pixel appears in all other adjacent pixels. The assumption is too harsh for scenes with many extended targets.
Here, the stable adjacent PS pixels are identified based on the phase stability analysis of temporal coherence. Different from conventional temporal coherence [18,19,31], here the deformation for temporal evolution is not modeled, which is motivated from the 3D phase unwrapping theory [45].
Mathematically, the wrapped phase, W{∆φ P,i }, of the pixel P of the ith differential interferogram can be written as the wrapped sum of five terms [35] where ∆φ H,P,i = 4πb ⊥i ∆h λr sin θ is the residual topographic phase due to the coarse DEM; φ D,P,i is the phase change due to movement of the scatterer along the satellite Line of Sight (LOS) direction; φ A,P,i is the phase due to the difference in atmospheric delay between passes; φ O,P,i is the residual phase due to satellite orbit inaccuracies; φ N,P,i is a noise term due to the scattering variability, thermal noise, co-registration errors and position uncertainty of the phase center in azimuth; and W{·} denotes the wrapping operator.
Without assuming a particular deformation model for temporal evolution, each PSC phase is filtered by a low-pass temporal operator. After subtracting the resulting filtered phase value ∆φ tl P,i from ∆φ P,i , the rewrapping phase gives where φ th denotes the temporally uncorrelated part of φ. φ th D,P,i is very small and can be considered as zero. For spatial phase unwrapping, the Delaunay triangulation network is used to connect the sparse PSCs via spatial arcs [17]. Since the temporal high-pass part of φ N,P,i does not change, the double-difference wrapped phase of the P 1 -P 2 arc after removal of temporal low-pass filtering is given by where ∆ P 2 P 1 represents the spatial difference operator between pixels P 1 and P 2 . Due to the spatial correlation of φ A,P,i and φ O,P,i , the spatial difference operator of adjacent pixels mitigates the effect of atmosphere and orbit, and thus ∆ P 2 Due to the spatial and temporal correlation of φ D,P,i , is very small and can be considered as zero. Afterwards, the temporal high-pass part of differential topography ∆ P 2 P 1 ∆h th p is estimated by maximizing the following temporal coherence measurement Then, the noise level and the temporal coherence of the P 1 -P 2 arc are obtained. Furthermore, the noise level of each PSC is estimated by the integration of differential noise of arcs. Finally, the temporal coherence of each PSC is obtained. The pixel whose coherence exceeds a fixed threshold (typically, 0.6-0.7), is selected to be the most likely PS. After dropping the pixels with low temporal coherence, a new Delaunay triangulation network is formed andφ N,P,i is reestimated. Several iterations (3-5 loops) are implemented to refine the selection of PS pixels.

3D Phase Unwrapping
Once the PS pixels are selected, the original wrapped interferogram phase must be unwrapped firstly. The 3D phase unwrapping method [45], which can effectively mitigate the phase jump of PS data in spatial and temporal dimensions, is adopted. There are four steps in its implementation. Firstly, resample the wrapped phase to grids and spatially low-pass filter the phase. Secondly, unwrap the phase difference of nearby grids temporally. Thirdly, unwrap phase spatially using maximum a posteriori (MAP) cost functions. Finally, interpolate 3D PS phase from unwrapped gridded interferograms. The results of unwrapping can be summarized as

Coarse Estimation of Topography Error and Deformation Rate
The temporal difference of unwrapped phase in Equation (8) is given by where φ D,P,i can be decomposed into linear and nonlinear components φ DM,P,i and φ DN,P,i [19]. The linear component is the dominant part for deformation and modeled by φ DM,P, where v is the mean deformation rate. The residual height ∆ĥ and linear deformation ratev are roughly estimated by the least square method, and then ∆φ H,P,i andφ DM,P,i are obtained.

Orbit Phase Estimation
After removal of the retrieved topographic phase and linear deformation phase, the unwrapped phase is written by In fact, the orbit phase of each interferogram can be linearly modeled associated with the spatial position [19,46], that is, where (P x , P y ) denotes the SAR coordinate position of P, and a i , b i and c i are the linear model parameters. Since the nonlinear deformation and atmosphere phase are small and not linearly modeled by Equation (11), a i , b i and c i are retrieved by 2D linear regression analysis, and thenφ O,P,i is obtained.

Atmospheric Phase Estimation
After removal of the retrieved topographic and mean deformation phase as well as orbit phase, the unwrapped phase becomes Based on the different spatial and temporal behavior of atmospheric phase and nonlinear deformation phase, the atmospheric phase can be separated from deformation through temporal band-pass filtering [17]. Firstly, a temporally low-pass filter in time L tl i is applied to the phase difference between neighboring PS pixels. Then, the separated high-pass differential phase is used to estimate atmospheric phase φ A,P,i bŷ where residual noise is suppressed during the least square inversion ∆ P 2 P 1 −1 of differential atmospheric phase.

Topography and Deformation Time Series
Finally, after removal of atmospheric and orbit phase, an iterative regression analysis is performed to improve the model parameters including the residual topography, the linear deformation rate and the deformation time series. Mathematically, the calibrated phase is given by According to Equation (9), topography error and deformation rate are estimated. Moreover, the deformation time series is obtained bŷ where spatially and temporally low-pass filters in [35] are adopted to reduce the noise phase.

Study Area and Dataset Used
Shunyi District is located at the northeast of Beijing, China. There are multiple types of man-made and natural targets for urban and non-urban areas study, such as the Beijing Capital International Airport and the Chaobai River. As shown in Figure 3, there are three terminals (T1, T2, and T3) and three runways numbered 36L, 36R, and 01 at the airport. T3 includes three buildings, T3 C, T3 D and T3 E. T3 and runway 01 were built in 2008 to cope with the extra traffic brought by the 2008 Olympic visitors. The Chaobai River is surrounded by several villages, and crossed by Baima Road and near Youdi Road. To assess the performance of the proposed IStaMPS, the T3 E and Chaobai River sites were selected as the study areas, which are marked with red box in Figure 3. On the one hand, there are multiple types of scattering characteristics over the T3 E site, including complex building and runways, which is adequate to assess the superiority of IStaMPS over StaMPS for urban areas. Due to the layover effect taking place in the building, this is a relatively challenging area for both algorithms. One SAR intensity image observed in this area is shown at the top-left side of Figure 3. The roof of the T3 E building has a streamlined shape. On the other hand, there are many bare soils and rocks located on both sides of the river. They are adequate to validate the reliability of IStaMPS for non-urban areas.
The dataset used in this study consists of a stack of 25 TerraSAR-X/TanDEM-X images, covering ground areas of 1.3 km × 1 km and 4.6 km × 3.6 km, across which the T3 E building and the Chaobai River are located, respectively. The stack of 25 TerraSAR-X/TanDEM-X images were acquired from June 2012 to August 2015 from the ascending track provided by the German Aerospace Center. The TerraSAR-X image acquired on 10 October 2013 was selected as the master image and the remaining images were jointly co-registered with it. The maximum temporal baseline is about 3.4 years, and the perpendicular baseline ranges from −182 m to 312 m. The temporal/perpendicular baseline plot for the SAR dataset is shown in Figure 4. The stack was imaged in Stripmap mode with a resolution up to 3 m at HH single polarization with a mean incidence angle of 34.74 • and 35.37 • . Moreover, the one arc-second SRTM DEM associated with the studied areas was used to derive the referred topographic phase. For example, the left side of Figure 3 lists the referred DEM associated with the T3 E site. After removing the derived phase from interferograms, series of differential interferograms were generated. It is worth noting that the bias of residual topography is very large for the T3 E building. Thus, a 3D reconstruction was necessary for investigation.

Experimental Results and Discussions
The same preprocessing method was used to generate interferograms of StaMPS and IStaMPS for a fair comparison. Due to lack of accurate topography measurement in StaMPS, the estimation method presented in Section 2.3 was used to retrieve the topography and deformation information for both the studied urban and non-urban areas. All results are shown in SAR coordinate.

PS Pixels Selection Results
Taking the T3 E site as an example, the detailed selection process for PS pixels by the IStaMPS method is described as follows. According to the distribution histograms of D A and µ A of the dataset shown in Figure 5, a subset of pixels most likely to be PS was selected as PSCs by thresholding the maximum value of D A and µ A at 0.4 and 5000, respectively. This step ensured that most pixels with a single-scatterer stable phase were included and most pixels overlaid by multiple strong scatterers were excluded from the initial PSCs. Secondly, a linear model for spatial coherence threshold associated with D A was derived from the initial PSCs based on StaMPS, as shown in Figure 6a. Then, the PSCs were updated by rejecting those pixels less than the adaptive coherence. Finally, the updated PSCs were connected with the Delaunay triangulation network, as shown in Figure 6b. The pixels with temporal coherence below 0.65 were dropped according to the process described in Section 2.2.
In total, 3183 PS pixels (1153 pixels/km 2 ) and 8980 PS pixels (3253 pixels/km 2 ) over the T3 E site were detected by StaMPS and IStaMPS, respectively. The density of the reliable measurement points detected by IStaMPS is about 2.8 times that detected by StaMPS. Subsequently, a comparison of measurement points of each step identified by the two methods is presented in Figure 7, where the three-step PSCs results are shown in the top row and bottom row, respectively. For the first step of amplitude analysis, there were 47,095 PSCs initially selected by StaMPS, while 46,554 PSCs were selected by applying IStaMPS, where the smaller number of IStaMPS was due to its extra constraint of µ A . For the second step, IStaMPS provided very good coverage over most parts of the man-made structures such as roofs and eaves of the T3E building, while StaMPS could not provide detailed information of T3 E eaves in the L region marked in Figure 5. Furthermore, the PSCs detection ratio, which is the number of latter PSCs over that of previous PSCs, was 0.383 for IStaMPS, higher than the value of 0.380 by StaMPS. The reason may be that the probabilistic model in Equation (1) is more suitable for the initial PSCs selected by IStaMPS than that by StaMPS. Thus, more reliable PSCs with a single dominant scatterer were effectively selected by the proposed IStaMPS. For example, some layover pixels were selected as PSCs by StaMPS from the comparison in Figure 7b,e. It was observed that more lounge bridges pixels on the left side (see the L region in Figure 3), most of which are overlaid by the roof scatter and ground scatter, were selected by StaMPS. For the last step, more adjacent PS pixels were kept by IStaMPS than by StaMPS from the comparison of Figure 7c,f. Stability of the final selected PS points was demonstrated from subsequent retrieving results.

T3 E Site
After selection of the measurement points, the residual topography and deformation rate in the radar LOS direction of measurement points were retrieved by StaMPS and IStaMPS. Their maps of reconstructed height and mean deformation rate are illustrated in Figures 8 and 9.  The overall height maps after adding back the coarse topography for both StaMPS and IStaMPS are shown in Figure 8. Since the airport lies in a flat area with an average altitude of about 35 m [47,48], and the maximum height of T3 E relative to the flat ground is about 47 m [49], the maximum value of absolute height is about 82 m. Although the realistic streamlined shape at top of the building was reconstructed by these two methods, the proposed IStaMPS provides a much better height map in the most likely layover region L. Besides, the height of left lounge bridges by StaMPS goes against the truth in the L region. Accordingly, IStaMPS has shown a significant improvement in measuring topography, especially for layover areas.
As shown in Figure 9, the deformation contour retrieved by IStaMPS is in good agreement with that by StaMPS. Moreover, the mean deformation rate of IStaMPS is in accord with the historic data in [37,47,48], demonstrating the effectiveness of IStaMPS in retrieving deformation. Although the two results have similar deformation patterns, the proposed strategy provides much denser measurements in building regions.
The SAR acquisition geometry along the S profile in range and elevation plane marked in the top-left sub-image of Figure 3 is shown in Figure 10, where both sides of lounge bridges and symmetric main structure of T3 E building can be observed. Due to the side-looking geometry, the layover area most likely happens in the leftmost lounge bridge L l , and shadow area most likely happens in the leftmost lounge bridge L r . These areas are challenging to the PS-InSAR technique. It is worth noting that the rightmost lounge bridge is slightly shorter than the left one. Thus, the regions of lounge bridges A 1 , A 2 , A 3 , A 4 , A 5 and A 6 were selected, whose roofs are of the same height relative to ground, to assess the reliability of IStaMPS in most likely layover areas. The mean heights of the common PS points in each region were compared to show the accuracy of both methods. The height statistics of each region are illustrated in Figure 11. It was observed that the mean heights of all regions were approximately equal for IStaMPS, while there was large bias between the left and the right for StaMPS. Thus, it can be seen that IStaMPS is more reliable than StaMPS. Because each region contains scattering from the roof and facade of lounge bridges, there exist some differences between the mean height of five regions. Accordingly, the std of each region is not zero in Figure 11. Subsequently, the advantage of IStaMPS over StaMPS in monitoring deformation is demonstrated in the following two aspects. On the one hand, the accuracy of retrieving height directly indicates the accuracy of mean deformation rate and thus the improved performance of IStaMPS over StaMPS in height reconstruction can be demonstrated by its improved performance in retrieving the mean deformation rate. On the other hand, the statistical results for the same region B marked with rectangle in Figure 3 were compared. The spatial statistical results of 25 observations are shown in Figure 12

Chaobai River
Due to the aforementioned PS selection process for T3 E area, here the reconstructed height and mean deformation rate maps retrieved by both methods were compared directly to discuss the performance of IStaMPS for non-urban areas. As shown in Figure 13, similar results were obtained by both methods in the natural area, while the proposed IStaMPS provided denser measurement points for all the man-made or natural targets on both sides of the river. Quantitatively, a total of 14,681 PS pixels and 37,771 PS pixels over the Chaobai River site were detected by StaMPS and IStaMPS, respectively. The density of the reliable measurement points detected by IStaMPS is about 2.6 times that by StaMPS. Since the pixels scattering from the river bridge share approximately equal height and deformation, the statistics of common points detected by both methods corresponding to region R of Figure 3 were compared quantitatively, as mentioned previously. The approximately equal mean height and deformation were obtained. However, the height std of StaMPS is 1.12 m, which is higher than the 0.

Deformation Time Series of the T3 E Building
A more detailed analysis from the IStaMPS results on the spatiotemporal characterization of land subsidence/uplift is discussed in this subsection. The acquisition on 22 January 2012 is set as the reference image, and the spatial-temporal subsidence and uplift evolution are clearly visible over time. As illustrated by cumulative deformation maps in Figure 14, LOS deformation time series over the T3E site identified by IStaMPS ranged from −46.84 mm to 40.32 mm during 2012-2015 for the TerraSAR-X datasets. The region close to the runways experienced a slow subsidence, while the northwest region of the T3 E building experienced a land uplift. This result is in agreement with previous studies in [37,47,48]. The maximum linear uplift rate is about 4.2 mm/year over the whole monitoring period (see Figure 9).

Conclusions
In this study, an improved PS-InSAR method, which can avoid the special overlaid pixels within multiple strong scatters for complex scenes and effectively remove the unstable adjacent pixels, was proposed based on the widely adopted method StaMPS. The PS pixels are selected with high density by the improved PS selection strategy. Moreover, the topography information not provided in StaMPS can be accurately measured in IStaMPS. A total of 25 TerraSAR-X/TanDEM-X Stripmap images acquired from June 2012 to August 2015 were used to test their performances. Although both methods have provided similar results in most regions, there are some differences, mainly concerning PS density and retrieving results in layover regions. The number of detected PS points is increased by 1.8 and 1.6 times for the T3 E building of the Beijing Capital International Airport and the Chaobai River area of Beijing Shunyi District, respectively. Moreover, the proposed IStaMPS achieved better results in topography measuring and deformation monitoring for complex buildings and natural scenes. Finally, the spatial-temporal evolution in the T3 E site was investigated with IStaMPS, and an exceptional uplift zone was detected in the northwest region of the T3 E building, with a maximum uplift rate of 4.2 mm/year.