Multi-Temporal InSAR Parallel Processing for Sentinel-1 Large-Scale Surface Deformation Mapping

: Interferometric synthetic aperture radar (InSAR) has achieved great success in various geodetic applications, and its potential for ground deformation measurements on the large scale has attracted increasingly more attention in recent years. The increasing number of synthetic aperture radar (SAR) satellite systems have steadily provided a large amount of SAR data. Among these systems, the Sentinel-1 mission can be considered a milestone in the development of InSAR techniques, o ﬀ ering new opportunities to monitor global surface deformation with high precision, due to its wide coverage, short revisit time, and free access. However, conventional InSAR techniques have encountered great challenges in large-scale InSAR processing over wide areas because of the large computational burden and complexity. In this work, we present a novel parallel computing-based coherent scatterer InSAR (P-CSInSAR) method for automatic and e ﬃ cient generation of deformation results from Sentinel-1 raw data. To achieve high parallelization performance for the overall InSAR processing chain, parallelization strategies at di ﬀ erent levels have been adopted in the P-CSInSAR method, which has been fully addressed in this work. To evaluate the e ﬃ ciency and accuracy of the proposed method, P-CSInSAR has been tested on the North China Plain regions with three adjacent frames of Sentinel-1 images, and the deformation results have been validated by GPS measurements. The experimental results conﬁrm the e ﬀ ectiveness of the proposed parallel computing-based P-CSInSAR method. The proposed method can also play an important role in exploiting Sentinel-1 InSAR big data for disaster prevention and reduction.


Introduction
Space-borne interferometric synthetic aperture radar has the ability to detect subtle surface deformations with wide coverage and high resolution; thus, it has been widely used in a range of geophysical applications, such as earthquakes, landslides, subsidence, and other natural or man-made hazards [1]. To overcome the limitations of temporal-spatial decorrelation and atmospheric effects in conventional interferometric synthetic aperture radar (InSAR) processing, multi-temporal InSAR (MT-InSAR) techniques, especially permanent scatterer interferometry (PSI) [2,3] and the small baseline method (SBAS) [4], have been developed by using the time series analysis of a stack of interferograms. These advanced processing algorithms [5] have achieved great success in obtaining ground deformation with high precision on the regional scale [6][7][8][9], and have made InSAR one of the most important remote sensing tools for risk prevention and monitoring. of the P-CSInSAR method has been tested on a real Sentinel-1 dataset over the North China Plain, acquired from September 2018 to March 2020. The accuracy of the estimated deformation results has been validated by GPS measurements, and the parallel computing efficiency of the proposed method has been evaluated. By developing the high-performance InSAR big data processing platform for the Sentinel-1 SAR dataset, we can realize the capability of routinely monitoring the widespread subsidence in the North China Plain and other city clusters in China, which will be very helpful for local governments in urban planning and policy making.
This paper is organized as follows. In Section 2, we describe the workflow of the proposed P-CSInSAR method and its parallelization strategies. In Section 3, the P-CSInSAR method is applied in a real case on the North China Plain, and the performance and accuracy of the proposed method are addressed. Finally, in Section 4, the conclusion and future developments of the present work are discussed.

Methodology
Developing an efficient parallel computational model for the Sentinel-1 MT-InSAR method is very challenging, as several aspects, such as load balancing, communication, and data dependencies, should be properly taken into account throughout the processing chain. The MT-InSAR method requires several processing steps, and the algorithms are faced upward with different data structures during the processing stages. Moreover, the sequential implementation of the MT-InSAR method has a strong dependency between different algorithm steps, and massive input/output (I/O) operations are performed during the processing. Therefore, to obtain a parallel solution for the MT-InSAR method with high efficiency and scalability, we should carefully design the parallel algorithm and organize the sequence of the MT-InSAR processing steps to minimize disk access, minimize interprocess communication, and maximize the usage of the CPU.
Through an analysis of the workflow and features of the Sentinel-1 MT-InSAR method, the parallelizability for each algorithm step has been studied. For specific algorithms deemed suitable for parallelization, the appropriate parallel strategies are selected. For instance, Sentienl-1 burst data are separated as independent images; thus, the processing for burst images can be naturally parallelized on the burst level. For those algorithms that cannot be directly implemented in parallel form, we should exploit the potential of data independency in the processing, and decompose the algorithm into the parallelizable and nonparallelizable parts; thus, the former will be executed in parallel. For example, in the time series analysis, image data can be divided into different blocks, while the calculation for these blocks is the parallelizable part, and the results fusion for blocks is the nonparallelizable part. By adopting the parallelism at different levels during the course of the processing, we can maximize the parallelization for the overall processing chain.
Based on above considerations, the framework of the proposed P-CSInSAR algorithm is mainly divided into three parts: Sentinel-1 Terrain Observation with Progressive Scans (TOPS)-mode image processing, Sentinel-1 interferometry, and time series analysis, as shown in Figure 1. We first provide an overview of the P-CSInSAR algorithm, and the detailed description is provided in Sections 2.1-2.3. The algorithm starts with preparation of the Sentinel-1 raw images, Digital Elevation Model (DEM), and the precise orbit files. After Sentinel-1 TOPS-mode image processing, a series of coregistrated SAR images and their parameter files are produced. Then, during Sentinel-1 interferometry, multilook interferograms with small baselines are generated, and the flat-Earth and topography phase are removed. Finally, through time series analysis, the deformation rate and displacement time series are obtained, and the output files can be exported into text, GeoTIFF, or shapefile formats. A large amount of intermediate results will save to the disk storage, and the program will delete some unused files if necessary. Moreover, the processing pipeline is written in Python and Bash scripts with predefined parameters, and can automatically generate deformation results from the raw Sentinel-1 data without any extra interaction work. In this part, the processing steps mainly include image coregistration and mosaicking. Unlike other SAR images, Sentinel-1 SLC (single look complex) data in IW mode use the Terrain Observation with Progressive Scans (TOPS) to achieve wide-swath coverage [31]. A TOPS-mode SAR image consists of three individual subswath files (IW1, IW2, and IW3), where each subswath contains a series of burst files (at least nine). There are overlapping areas between the consecutive bursts and subswaths, and there are invalid data in the black regions of burst images. Due to the azimuth antenna steering, Sentinel-1 SLC data have a stricter requirement for image coregistration, which demands a coregistration accuracy of 0.001 samples, and a small mis-coregistration will introduce an unwanted phase ramp into interferograms among adjacent bursts.
Assuming N Sentinel-1 SAR images of the same scene acquired at times 1 , 2 , . . . . . . . , , these SLC images are first unpacked from the raw zip files and subjected to data format conversion. Then, the precise orbit files are applied to update the orbit parameters, which are then used in the geometric coregistration. Through TOPS split processing, all burst images are extracted from each subswath image file and stored into separate burst image files. Sentinel-1 image coregistration is then performed on these burst images. However, due to the azimuth antenna steering, the deramping phase must be removed from the burst data prior to image resampling operation, while after image coregistration, the coregistrated burst images are compensated for the deramping phase. Thus, the deramping phase is estimated by the Doppler centroid frequency, and then stored in the disk to be  In this part, the processing steps mainly include image coregistration and mosaicking. Unlike other SAR images, Sentinel-1 SLC (single look complex) data in IW mode use the Terrain Observation with Progressive Scans (TOPS) to achieve wide-swath coverage [31]. A TOPS-mode SAR image consists of three individual subswath files (IW1, IW2, and IW3), where each subswath contains a series of burst files (at least nine). There are overlapping areas between the consecutive bursts and subswaths, and there are invalid data in the black regions of burst images. Due to the azimuth antenna steering, Sentinel-1 SLC data have a stricter requirement for image coregistration, which demands a coregistration accuracy of 0.001 samples, and a small mis-coregistration will introduce an unwanted phase ramp into interferograms among adjacent bursts.
Assuming N Sentinel-1 SAR images of the same scene acquired at times t 1 , t 2 , . . . . . . , t N , these SLC images are first unpacked from the raw zip files and subjected to data format conversion. Then, the precise orbit files are applied to update the orbit parameters, which are then used in the geometric coregistration. Through TOPS split processing, all burst images are extracted from each subswath image file and stored into separate burst image files. Sentinel-1 image coregistration is then performed on these burst images. However, due to the azimuth antenna steering, the deramping phase must be removed from the burst data prior to image resampling operation, while after image coregistration, the coregistrated burst images are compensated for the deramping phase. Thus, the deramping phase is estimated by the Doppler centroid frequency, and then stored in the disk to be used in the deramp step and reramp step. SAR image coregistration transforms all the slave images to the master reference Remote Sens. 2020, 12, 3749 5 of 20 geometry, and the good coregistration is vital to generating correct interferograms, especially for Sentinel-1 images. To achieve high coregistration accuracy, a two-step method, the initial coregistration and fine coregistration, is widely used in Sentinel-1 image processing.
Geometric coregistration is used as the first-order coregistration by the use of precise orbit state vectors and an external DEM. This method transforms the pixel from SAR coordinates to Cartesian reference coordinates based on the range-Doppler (RD) equations, and the range and azimuth offsets between the master and slave images can be estimated. Then, slave image resampling and interpolation are performed by using the sinc kernel. Geometric coregistration does not rely on SAR image coherence, and its accuracy greatly benefits from the precise orbit file of Sentinel-1. The processing step is performed on all the burst data for each master-slave image pair at full spatial resolution; thus, the process requires intensive memory and I/O access, and is one of the most time-consuming steps during Sentinel-1 image TOPS-mode processing.
After the initial coregistration, the enhanced spectral diversity (ESD) method [32] is applied to further correct the residual azimuth coregistration errors. The algorithm exploits the phase difference in the overlapping area among adjacent bursts, and uses the difference phase to retrieve the azimuth offset. First, the ESD phase is calculated by the double difference for every overlapping area, and the ESD phase for Burst i and j can be given as follows: where m and s refer to the master and slave images, respectively. The relation between the ESD phase and azimuth offset ∆a can be given as follows: where ∆ f d is the Doppler frequency difference in the overlap areas between two bursts, and f a is the azimuth sampling frequency. Thus, we can obtain the azimuth offset ∆a based on Equations (1) and (2). The ESD algorithm is implemented on all the master-slave pairs. To reduce the potential temporal decorrelation effect of some pairs, a joint time series coregistration for redundant image pairs [33] is used to obtain more accurate offsets. In this work, the following pairs are selected for ESD calculation, as shown in Figure 2.
Remote Sens. 2020, 12, x FOR PEER REVIEW 5 of 19 used in the deramp step and reramp step. SAR image coregistration transforms all the slave images to the master reference geometry, and the good coregistration is vital to generating correct interferograms, especially for Sentinel-1 images. To achieve high coregistration accuracy, a two-step method, the initial coregistration and fine coregistration, is widely used in Sentinel-1 image processing. Geometric coregistration is used as the first-order coregistration by the use of precise orbit state vectors and an external DEM. This method transforms the pixel from SAR coordinates to Cartesian reference coordinates based on the range-Doppler (RD) equations, and the range and azimuth offsets between the master and slave images can be estimated. Then, slave image resampling and interpolation are performed by using the sinc kernel. Geometric coregistration does not rely on SAR image coherence, and its accuracy greatly benefits from the precise orbit file of Sentinel-1. The processing step is performed on all the burst data for each master-slave image pair at full spatial resolution; thus, the process requires intensive memory and I/O access, and is one of the most timeconsuming steps during Sentinel-1 image TOPS-mode processing.
After the initial coregistration, the enhanced spectral diversity (ESD) method [32] is applied to further correct the residual azimuth coregistration errors. The algorithm exploits the phase difference in the overlapping area among adjacent bursts, and uses the difference phase to retrieve the azimuth offset. First, the ESD phase is calculated by the double difference for every overlapping area, and the ESD phase for Burst i and j can be given as follows: where m and s refer to the master and slave images, respectively. The relation between the ESD phase and azimuth offset can be given as follows: where is the Doppler frequency difference in the overlap areas between two bursts, and is the azimuth sampling frequency. Thus, we can obtain the azimuth offset based on Equations (1) and (2). The ESD algorithm is implemented on all the master-slave pairs. To reduce the potential temporal decorrelation effect of some pairs, a joint time series coregistration for redundant image pairs [33] is used to obtain more accurate offsets. In this work, the following pairs are selected for ESD calculation, as shown in Figure 2. The ESD algorithm is implemented on the above-selected pairs, and the estimated azimuth offsets = [ 0−1 , 0−2 , . . . , ( −2)−( −1) ] are obtained. Thus, the azimuth offsets relative to the master image can be calculated by using the least squares method: where is the system matrix defining the relation between the different offsets. By using this redundant offset calculation approach, we can obtain more accurate and reliable coregistrated images.
After image coregistration and reramp processing, these burst images will mosaic together in the burst and subswath merging step. Then, the whole coregistrated time series of SLC images are obtained, which can be used in the following interferometry steps. The burst and subswath merging step is the most memory-intensive step because the entire image data with the original resolution must be read into memory. The ESD algorithm is implemented on the above-selected pairs, and the estimated azimuth offsets DAz = DAz 0−1 , DAz 0−2 , . . . , DAz (N−2)−(N−1) are obtained. Thus, the azimuth offsets relative to the master image can be calculated by using the least squares method: where A is the system matrix defining the relation between the different offsets. By using this redundant offset calculation approach, we can obtain more accurate and reliable coregistrated images. After image coregistration and reramp processing, these burst images will mosaic together in the burst and subswath merging step. Then, the whole coregistrated time series of SLC images are obtained, which can be used in the following interferometry steps. The burst and subswath merging step is the most memory-intensive step because the entire image data with the original resolution must be read into memory. Although the Sentinel-1 TOPS-mode has caused some problems to the SAR image coregistration, its unique subswath and burst structure has provided good opportunities for parallelizing the processing algorithms. The parallelization strategies for this step are straightforward, and the data-center parallel approach is adopted, whereby a series of process threads apply the same operation on the different data.
From the previous algorithm description, the burst and subswath data are separated as the independent image files after the TOPS split step. Then, deramp, geometric coregistration, ESD coregistration, and reramp algorithms are all implemented on these burst images, while burst and subswath merging is performed on the three subswath images. Moreover, both the burst and subswath files are almost the same in size. Thus, the algorithms during this processing can be easily parallelized without having to consider the problem of data dependency or load unbalancing, leading to these algorithms being executed in parallel on different burst or subswath images, which can be intrinsically distributed to multiple process threads, as shown in Figure 3.
During this stage, the algorithms process the image data with its original spatial resolution, and they are the most memory-and storage-intensive steps in the whole processing chain. In addition, the output intermediate results are almost three times the size of the unpacked raw Sentinel-1 images, and these massive I/O operations put excessive pressure on the hard disk at the beginning and the end of each algorithm step, especially for parallel processing. Thus, the speed of disk access has become a major limitation in parallel efficiency improvement for Sentinel-1 image TOPS-mode processing. Although the Sentinel-1 TOPS-mode has caused some problems to the SAR image coregistration, its unique subswath and burst structure has provided good opportunities for parallelizing the processing algorithms. The parallelization strategies for this step are straightforward, and the datacenter parallel approach is adopted, whereby a series of process threads apply the same operation on the different data.
From the previous algorithm description, the burst and subswath data are separated as the independent image files after the TOPS split step. Then, deramp, geometric coregistration, ESD coregistration, and reramp algorithms are all implemented on these burst images, while burst and subswath merging is performed on the three subswath images. Moreover, both the burst and subswath files are almost the same in size. Thus, the algorithms during this processing can be easily parallelized without having to consider the problem of data dependency or load unbalancing, leading to these algorithms being executed in parallel on different burst or subswath images, which can be intrinsically distributed to multiple process threads, as shown in Figure 3.
During this stage, the algorithms process the image data with its original spatial resolution, and they are the most memory-and storage-intensive steps in the whole processing chain. In addition, the output intermediate results are almost three times the size of the unpacked raw Sentinel-1 images, and these massive I/O operations put excessive pressure on the hard disk at the beginning and the end of each algorithm step, especially for parallel processing. Thus, the speed of disk access has become a major limitation in parallel efficiency improvement for Sentinel-1 image TOPS-mode processing.

Algorithm Steps for Sentinel-1 Interferometry
This part aims to generate small baseline multilook interferograms and an average coherence map for time series analysis. The processing steps mainly include interferometric pair selection, interferometry processing, removing flat-Earth and topography phases (differential interferometry processing), and multilook filtering.
First, the interferometric pairs with short temporal and spatial baselines are selected according to the given threshold. As Sentinel-1 is characterized by its small baseline, the selected interferometric pairs by the threshold are still too numerous for later processing. To search the optimally connected baseline network, we estimate the coherence for all interferometric pair candidates by randomly selecting several blocks from the image and calculating the average coherence of these blocks. To ensure the accuracy of coherence estimation, the block size should not be too small (no less than 16 × 16), and the block size in our experiment is 64 × 64. Thus, the interferometric pairs with higher coherence are obtained.

Algorithm Steps for Sentinel-1 Interferometry
This part aims to generate small baseline multilook interferograms and an average coherence map for time series analysis. The processing steps mainly include interferometric pair selection, interferometry processing, removing flat-Earth and topography phases (differential interferometry processing), and multilook filtering.
First, the interferometric pairs with short temporal and spatial baselines are selected according to the given threshold. As Sentinel-1 is characterized by its small baseline, the selected interferometric pairs by the threshold are still too numerous for later processing. To search the optimally connected baseline network, we estimate the coherence for all interferometric pair candidates by randomly selecting several blocks from the image and calculating the average coherence of these blocks. To ensure the accuracy of coherence estimation, the block size should not be too small (no less than 16 × 16), and the block size in our experiment is 64 × 64. Thus, the interferometric pairs with higher coherence are obtained.
Then, we generate the interferograms by complex conjugate calculation in the interferometry processing for all the interferometric pairs. The external DEM is used to model and remove the flat-Earth Remote Sens. 2020, 12, 3749 7 of 20 and topography phase contributions to generate the differential interferograms. Multilook filtering is the spatial complex average for differential interferograms used to reduce the decorrelation noise and generate the spatial coherence map. To downsample the size of the processing data, 10 and 2 (or 20 and 4) looks in range and azimuth directions, respectively, are typically used in InSAR processing.
To save disk storage and reduce I/O operations, complex image processing for interferometry, differential interferometry, and multilook filtering can be combined without the need to output intermediate results. In addition, some interferometry-related parameters (spatial baseline, slant range, etc.) at each range sample and azimuth line for all the interferometric pairs are also calculated in this part.

Parallelization Strategies for Sentinel-1 Interferometry
In the interferometric pair selection step, parallel computing can be naturally applied to the independent selected blocks to calculate the coherence. Thus, we mainly focus on parallelization for subsequent interferometry and multilook filtering processing.
These algorithms are mainly divided into two groups, as shown in Figure 4. One is operated on the pixel basis, such as interferometry, flat-Earth and topography estimation, differential interferometry, and interferometric parameter calculation. The pixel calculations are independent of each other, and access to data from other pixels is not required. The other is computed on the window basis, which requires operation on a local region of the image, such as multilook filtering. The window processing also has no dependencies on each other in the calculation. Both types of algorithms are local operators with a good separation of neighboring units. Then, we generate the interferograms by complex conjugate calculation in the interferometry processing for all the interferometric pairs. The external DEM is used to model and remove the flat-Earth and topography phase contributions to generate the differential interferograms. Multilook filtering is the spatial complex average for differential interferograms used to reduce the decorrelation noise and generate the spatial coherence map. To downsample the size of the processing data, 10 and 2 (or 20 and 4) looks in range and azimuth directions, respectively, are typically used in InSAR processing.
To save disk storage and reduce I/O operations, complex image processing for interferometry, differential interferometry, and multilook filtering can be combined without the need to output intermediate results. In addition, some interferometry-related parameters (spatial baseline, slant range, etc.) at each range sample and azimuth line for all the interferometric pairs are also calculated in this part.

Parallelization Strategies for Sentinel-1 Interferometry
In the interferometric pair selection step, parallel computing can be naturally applied to the independent selected blocks to calculate the coherence. Thus, we mainly focus on parallelization for subsequent interferometry and multilook filtering processing.
These algorithms are mainly divided into two groups, as shown in Figure 4. One is operated on the pixel basis, such as interferometry, flat-Earth and topography estimation, differential interferometry, and interferometric parameter calculation. The pixel calculations are independent of each other, and access to data from other pixels is not required. The other is computed on the window basis, which requires operation on a local region of the image, such as multilook filtering. The window processing also has no dependencies on each other in the calculation. Both types of algorithms are local operators with a good separation of neighboring units.  Therefore, algorithm parallelization for Sentinel-1 interferometry processing requires little effort to partition the processing data, and parallel computing can either be performed pixel by pixel, or block by block after dividing the image into different blocks. To make use of file reading, parallelism on the image line level is adopted. The SAR image files are read by line, and these lines of data are then distributed to the different process threads, as shown in Figure 5.  Therefore, algorithm parallelization for Sentinel-1 interferometry processing requires little effort to partition the processing data, and parallel computing can either be performed pixel by pixel, or block by block after dividing the image into different blocks. To make use of file reading, parallelism on the image line level is adopted. The SAR image files are read by line, and these lines of data are then distributed to the different process threads, as shown in Figure 5. Then, we generate the interferograms by complex conjugate calculation in the interferometry processing for all the interferometric pairs. The external DEM is used to model and remove the flat-Earth and topography phase contributions to generate the differential interferograms. Multilook filtering is the spatial complex average for differential interferograms used to reduce the decorrelation noise and generate the spatial coherence map. To downsample the size of the processing data, 10 and 2 (or 20 and 4) looks in range and azimuth directions, respectively, are typically used in InSAR processing.
To save disk storage and reduce I/O operations, complex image processing for interferometry, differential interferometry, and multilook filtering can be combined without the need to output intermediate results. In addition, some interferometry-related parameters (spatial baseline, slant range, etc.) at each range sample and azimuth line for all the interferometric pairs are also calculated in this part.

Parallelization Strategies for Sentinel-1 Interferometry
In the interferometric pair selection step, parallel computing can be naturally applied to the independent selected blocks to calculate the coherence. Thus, we mainly focus on parallelization for subsequent interferometry and multilook filtering processing.
These algorithms are mainly divided into two groups, as shown in Figure 4. One is operated on the pixel basis, such as interferometry, flat-Earth and topography estimation, differential interferometry, and interferometric parameter calculation. The pixel calculations are independent of each other, and access to data from other pixels is not required. The other is computed on the window basis, which requires operation on a local region of the image, such as multilook filtering. The window processing also has no dependencies on each other in the calculation. Both types of algorithms are local operators with a good separation of neighboring units.  Therefore, algorithm parallelization for Sentinel-1 interferometry processing requires little effort to partition the processing data, and parallel computing can either be performed pixel by pixel, or block by block after dividing the image into different blocks. To make use of file reading, parallelism on the image line level is adopted. The SAR image files are read by line, and these lines of data are then distributed to the different process threads, as shown in Figure 5.  The algorithm computations are not complex in Sentinel-1 interferometry processing, but they require intensive I/O operations due to the large number of interferograms and interferometry-related parameter files. Thus, the speed of disk access may influence the computing time of this processing step.

Algorithm Steps for Time Series Analysis
This part carries out the time series analysis of the interferometric phase for the coherent scatterers, and estimates the deformation results from wrapped interferograms. This processing stage mainly includes coherent scatterer selection, deformation parameter estimation, network integration, residual phase unwrapping, and atmospheric filtering.
The high-coherent-scatterer candidates are identified by the average coherence map based on the given threshold, and the interferometric phase and interferometric parameters for these coherent scatterers are extracted from the corresponding files. Then, the following steps are performed on the sparse pixels with less memory and storage consumption. Coherent scatterers are first connected with the Delaunay triangulation network, and the atmospheric phase is mitigated through the double-phase difference of the adjacent pixel at the same arc. Deformation and residual height parameter estimation is performed on all the arcs of the triangulation network through time series analysis of the wrapped phase.
For the double-differential phase ∆ϕ ij for Arc i-j, after removing the atmospheric phase, the residual topography phase ϕ topo , deformation phase ϕ de f , and noise ε are remained. The residual topography phase and deformation phase are usually modeled as the known functions of the temporal baseline T and spatial perpendicular baseline B, respectively. Thus, the double-differential phase can be given as follows: where v ij and h ij are the estimated deformation rate and height parameters, respectively. d T; v ij is the deformation model, and usually the polynomial or seasonal model. β is the height-to-phase conversion factor. As ∆ϕ ij is the wrapped phase and cannot be directly calculated by matrix inversion, the parameters are estimated by the following optimization method: where γ ij is the temporal coherence and K is the interferogram number. The optimal parameters v ij and h ij can be obtained by searching in the solution space with various optimization techniques [34]. At the same time, some poorly estimated arc results will be rejected from the network if the temporal coherence γ ij is below the given threshold. Finally, the reliable coherent scatters are selected in this step. After estimating the deformation and height parameters for all the arcs in the network, the differential results are integrated by the network adjustment. To improve the inversion results of the adjustment matrix with the least squares solution, a weighted ridge-estimator is adopted to obtain more robust results by adding a regularization constraint, given as follows: where A is the adjustment matrix that defines the relation of the arcs. W is the weighted matrix, and the diagonal elements are the temporal coherence for the arcs; σ is a positive constant of the ridge-estimator parameter; D is the differential results of the estimated deformation rate or height for every arc; and V is the final estimated result with respect to the reference pixel. Thus, the linear deformation rate and residual height results can be obtained from Equation (6). Then, a series of residual phases are acquired by subtracting the estimated linear deformation and height phase from the differential phase, and the phase unwrapping is then used to retrieve the Remote Sens. 2020, 12, 3749 9 of 20 integer phase ambiguity for residual interferograms before estimating the atmospheric and nonlinear deformation phase. This process is performed by applying a two-dimensional spatial minimum cost flow (MCF) phase unwrapping algorithm [35], which is one of the most demanding steps in time series analysis. The unwrapped residual interferograms ∆ϕ for the selected small baseline interferometric pairs are obtained after unwrapping processing, and they can be expressed by the following function: where ϕ is the residual-phase time series with respect to the first acquisition time t 1 , and B is the system matrix that defines the relation of the interferometric pairs. The solution for ϕ can be seen as an L2-norm minimizing problem. The singular value decomposition (SVD) method is implemented to solve this minimizing model by decomposing the system matrix B as B = USV T , while the system matrix is common for all coherent scatterers. Thus, the residual-phase time series relative to the first acquisition time t 1 are acquired in this step.
To estimate and remove the atmospheric phase contribution from the residual phase time series, the atmospheric filtering is performed based on the spatial-temporal characteristics of the atmospheric phase. The atmospheric phase time series can be estimated by temporal high-pass filtering and spatial low-pass filtering [[ϕ res ] HP_time ] LP_space . Thus, removing the atmospheric phase from the residual-phase time series and the final displacement time series can be obtained. Once all the frames have been processed, mosaicking of the results is carried out to generate the deformation results for the entire study regions by employing the overlapping regions among different frames or by integrating the GPS measurements.

Parallelization Strategies for Time Series Analysis
From the above processing steps, we can see that multiple types of algorithms are involved in the time series analysis, including the temporal dimension algorithm (deformation estimation step), spatial dimension algorithm (network integration and phase unwrapping step), and temporal-spatial hybrid dimension algorithm (atmospheric filtering step). Thus, very different parallelization strategies are adopted to address these different algorithmic structures.
The conventional process of directly connecting millions of coherent scatterers with a triangulation network will lead to solving a large-scale matrix in network integration; thus, the computation in this step is of low efficiency. To reduce the computing scale and optimize the network calculation, parallelization is realized by splitting the image into different blocks, and block-level parallelism is adopted in this step. The image is split based on the density of the coherent scatterer to balance the computing load; thus, a series of unequal-sized blocks are obtained. As a result, the original network is replaced by several small-scale networks, which means that the large-scale matrix is decomposed into a series of submatrices. In addition, the calculations for these small submatrices can run in parallel on the different process threads, as shown in Figure 6. In this way, the computational efficiency of network integration can be greatly improved.
Remote Sens. 2020, 12, x FOR PEER REVIEW 9 of 19 integer phase ambiguity for residual interferograms before estimating the atmospheric and nonlinear deformation phase. This process is performed by applying a two-dimensional spatial minimum cost flow (MCF) phase unwrapping algorithm [35], which is one of the most demanding steps in time series analysis. The unwrapped residual interferograms for the selected small baseline interferometric pairs are obtained after unwrapping processing, and they can be expressed by the following function: where is the residual-phase time series with respect to the first acquisition time 1 , and is the system matrix that defines the relation of the interferometric pairs. The solution for can be seen as an L2-norm minimizing problem. The singular value decomposition (SVD) method is implemented to solve this minimizing model by decomposing the system matrix as = , while the system matrix is common for all coherent scatterers. Thus, the residual-phase time series relative to the first acquisition time 1 are acquired in this step.
To estimate and remove the atmospheric phase contribution from the residual phase time series, the atmospheric filtering is performed based on the spatial-temporal characteristics of the atmospheric phase. The atmospheric phase time series can be estimated by temporal high-pass filtering and spatial low-pass filtering [[ ] _ ] _ . Thus, removing the atmospheric phase from the residual-phase time series and the final displacement time series can be obtained. Once all the frames have been processed, mosaicking of the results is carried out to generate the deformation results for the entire study regions by employing the overlapping regions among different frames or by integrating the GPS measurements.

Parallelization Strategies for Time Series Analysis
From the above processing steps, we can see that multiple types of algorithms are involved in the time series analysis, including the temporal dimension algorithm (deformation estimation step), spatial dimension algorithm (network integration and phase unwrapping step), and temporal-spatial hybrid dimension algorithm (atmospheric filtering step). Thus, very different parallelization strategies are adopted to address these different algorithmic structures.
The conventional process of directly connecting millions of coherent scatterers with a triangulation network will lead to solving a large-scale matrix in network integration; thus, the computation in this step is of low efficiency. To reduce the computing scale and optimize the network calculation, parallelization is realized by splitting the image into different blocks, and block-level parallelism is adopted in this step. The image is split based on the density of the coherent scatterer to balance the computing load; thus, a series of unequal-sized blocks are obtained. As a result, the original network is replaced by several small-scale networks, which means that the large-scale matrix is decomposed into a series of submatrices. In addition, the calculations for these small submatrices can run in parallel on the different process threads, as shown in Figure 6. In this way, the computational efficiency of network integration can be greatly improved.  After establishing the network, the deformation estimation can be carried out for the wrapped phase of all arcs on the temporal domain. As the calculations among different arcs are independent of each other, the arc-based parallelization strategy is adopted in this stage, and the deformation estimation for these arcs runs in parallel on the different process threads. The next step is phase unwrapping for residual interferograms. The MCF phase unwrapping algorithm is a global optimization method, and the different parts of the image have strong dependencies under the global constraint. To avoid potential unwrapping errors by improper image partitioning, the task-parallel method is adopted by distributing the wrapped residual images to the different process threads, and the MCF algorithm operates on these process threads in parallel to complete all the tasks, as shown in Figure 7.
Then, the unwrapped residual interferograms follow up with SVD calculation, which is conducted for each coherent scatterer in the temporal domain independently; thus, its parallelization strategy is very similar to the Sentinel-1 interferometry by parallel computing on a pixel basis. The final step is atmospheric filtering, which consists of two-step spatial and temporal operations. The temporal operation is the same for SVD calculation as a pixel operator in the temporal domain and the pixel-level parallelism is adopted, while the spatial filtering can use the window-based parallel computing method with parallelism similar to the multilook filtering.
Time series analysis is one of the most complicated steps in Sentinel InSAR processing due to the temporal-spatial three-dimensional structure data, and the time consumption in this step is close to that of Sentinel-1 image coregistration. Moreover, the time series analysis step deals with sparse pixels and has smaller I/O operation and storage requirements than the previous steps.
Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 19 After establishing the network, the deformation estimation can be carried out for the wrapped phase of all arcs on the temporal domain. As the calculations among different arcs are independent of each other, the arc-based parallelization strategy is adopted in this stage, and the deformation estimation for these arcs runs in parallel on the different process threads. The next step is phase unwrapping for residual interferograms. The MCF phase unwrapping algorithm is a global optimization method, and the different parts of the image have strong dependencies under the global constraint. To avoid potential unwrapping errors by improper image partitioning, the task-parallel method is adopted by distributing the wrapped residual images to the different process threads, and the MCF algorithm operates on these process threads in parallel to complete all the tasks, as shown in Figure 7.
Then, the unwrapped residual interferograms follow up with SVD calculation, which is conducted for each coherent scatterer in the temporal domain independently; thus, its parallelization strategy is very similar to the Sentinel-1 interferometry by parallel computing on a pixel basis. The final step is atmospheric filtering, which consists of two-step spatial and temporal operations. The temporal operation is the same for SVD calculation as a pixel operator in the temporal domain and the pixel-level parallelism is adopted, while the spatial filtering can use the window-based parallel computing method with parallelism similar to the multilook filtering.
Time series analysis is one of the most complicated steps in Sentinel InSAR processing due to the temporal-spatial three-dimensional structure data, and the time consumption in this step is close to that of Sentinel-1 image coregistration. Moreover, the time series analysis step deals with sparse pixels and has smaller I/O operation and storage requirements than the previous steps.

Study Area and Dataset
We apply the proposed P-CSInSAR method in the previous section to the North China Plain, with three frames of Sentinel-1 SLC images along the same track (142-1, 142-2, and 142-3). The North China Plain has one of the largest subsidence bowls in the world, and most of the cities around this region have suffered serious ground subsidence problems, which have become the major geological hazards inhibiting the sustainability of regional development. The three frames of Sentinel-1 images have been acquired during the September 2018-March 2020 time span, composed of 135 acquisitions, with descending track 142. The study region has covered Beijing, Tianjin, Hebei, and Shandong provinces with an overall area of more than 120,000 km 2 , and the location of this region in China is shown in Figure 8a with red boxes. The outlines of the Sentinel-1 frames are shown in Figure 8b with black boxes, and the GNSS stations of the China Earthquake Administration are shown as red dots.

Study Area and Dataset
We apply the proposed P-CSInSAR method in the previous section to the North China Plain, with three frames of Sentinel-1 SLC images along the same track (142-1, 142-2, and 142-3). The North China Plain has one of the largest subsidence bowls in the world, and most of the cities around this region have suffered serious ground subsidence problems, which have become the major geological hazards inhibiting the sustainability of regional development. The three frames of Sentinel-1 images have been acquired during the September 2018-March 2020 time span, composed of 135 acquisitions, with descending track 142. The study region has covered Beijing, Tianjin, Hebei, and Shandong provinces with an overall area of more than 120,000 km 2 , and the location of this region in China is shown in Figure 8a with red boxes. The outlines of the Sentinel-1 frames are shown in Figure 8b with black boxes, and the GNSS stations of the China Earthquake Administration are shown as red dots.

The Comparsion Experiment
In this experiment, we compare our method with SNAP/GMTSAR -GIANT software packages over the Scene 142-2 with 45 images. The P-CSInSAR method is first applied in this region. The image at time 20180912 is the master image. After image coregistration, 95 interferograms with the perpendicular baseline less than 180 m, temporal baseline less than 50 days, and with higher coherence are generated, and the multilook numbers are 10 and 2 in the range and azimuth direction, respectively. Through time series analysis, more than 1.8 million scatterers are selected and the temporal coherence threshold is set to 0.7. As for the second method, SNAP or GMTSAR can only perform the Sentinel-1 SAR image coregistration and interferometry, and a total of 101 interferograms are generated with the same baseline threshold and multilook numbers. Then, GIANT is used to perform the time series analysis, and the ERA5 atmospheric model is selected in the atmospheric phase correction. The deformation results by two methods are shown with the same colorbar in Figure 9, and the processing times for different steps are listed in Table 1.

The Comparsion Experiment
In this experiment, we compare our method with SNAP/GMTSAR -GIANT software packages over the Scene 142-2 with 45 images. The P-CSInSAR method is first applied in this region. The image at time 20180912 is the master image. After image coregistration, 95 interferograms with the perpendicular baseline less than 180 m, temporal baseline less than 50 days, and with higher coherence are generated, and the multilook numbers are 10 and 2 in the range and azimuth direction, respectively. Through time series analysis, more than 1.8 million scatterers are selected and the temporal coherence threshold is set to 0.7. As for the second method, SNAP or GMTSAR can only perform the Sentinel-1 SAR image coregistration and interferometry, and a total of 101 interferograms are generated with the same baseline threshold and multilook numbers. Then, GIANT is used to perform the time series analysis, and the ERA5 atmospheric model is selected in the atmospheric phase correction. The deformation results by two methods are shown with the same colorbar in Figure 9, and the processing times for different steps are listed in Table 1.

The Comparsion Experiment
In this experiment, we compare our method with SNAP/GMTSAR -GIANT software packages over the Scene 142-2 with 45 images. The P-CSInSAR method is first applied in this region. The image at time 20180912 is the master image. After image coregistration, 95 interferograms with the perpendicular baseline less than 180 m, temporal baseline less than 50 days, and with higher coherence are generated, and the multilook numbers are 10 and 2 in the range and azimuth direction, respectively. Through time series analysis, more than 1.8 million scatterers are selected and the temporal coherence threshold is set to 0.7. As for the second method, SNAP or GMTSAR can only perform the Sentinel-1 SAR image coregistration and interferometry, and a total of 101 interferograms are generated with the same baseline threshold and multilook numbers. Then, GIANT is used to perform the time series analysis, and the ERA5 atmospheric model is selected in the atmospheric phase correction. The deformation results by two methods are shown with the same colorbar in Figure 9, and the processing times for different steps are listed in Table 1.
(a) (b) Figure 9. The deformation rate for Scene 142-2 estimated by P-CSInSAR (a) and GIANT (b). Figure 9. The deformation rate for Scene 142-2 estimated by P-CSInSAR (a) and GIANT (b).  Figure 9, we can see that the deformation results estimated by our method are very similar to the results by GIANT, and the major subsidence regions can be identified from both deformation maps. The small differences between two deformation rate maps are mainly caused by the atmospheric effect and the different interferograms selected in two methods. As for the processing time, it can be seen from Table 1 that the P-CSInSAR method has greatly improved the efficiency of Sentinel-1 InSAR processing, reducing a week's work of conventional processing to 14 hours, and the processing for all the three frames can be finished in several days. The highly automatic processing chain saves considerable human and time costs.

Results
In this experiment, the three frames of Sentinel-1 images are processed by the P-CSInSAR method, and the total number of Sentinel-1 acquisitions is 135. The image at time 20180912 is taken as the master image, and all the other images are coregistrated to this reference image. TanDEM-X (90 m) is used in the geometric coregistration and differential interferometry processing. The interferometric pairs are selected with a small temporal baseline of 50 days and a small spatial baseline of 180 m, and they are further selected according to the estimated coherence. As a result, the total number of the interferograms for the three frames is 310. We multilook each interferogram with 10 and 2 looks in the range and azimuth direction, respectively, followed by smooth filtering with a window length of 8 pixels. The coherence threshold for the average coherence map to select coherent scatterers is set to 0.7, and more than 6.5 million scatterers are identified. Through the time series analysis of the stack of interferograms, the deformation products are finally acquired, including the mean deformation rate and displacement time series, with a total of 5.4 million measurements. The workflow is automatically executed based on a Bash script. After finishing all the InSAR processing, a fusion operation is carried out to merge the deformation results for three frames, and the mean LOS velocity map for the whole area is displayed in Figure 10.
In Figure 11, we can see very serious subsidence in most of the cities around this region, which has been extensively investigated by many researchers, and our results have shown similar subsidence patterns with these previous works [36][37][38][39]. Several subsidence centers can be identified from the mean velocity map. Among them, Beijing-Tianjin regions have a deformation rate more than 60 mm/year, including the Tongzhou, Langfang, and Tianjin district, and subsidence in this region has shown a trend of gradually growing together. To illustrate the InSAR deformation results in the Beijing-Tianjin region with the geodetic measurements [40], both results are shown in Figure 11. scatterers is set to 0.7, and more than 6.5 million scatterers are identified. Through the time series analysis of the stack of interferograms, the deformation products are finally acquired, including the mean deformation rate and displacement time series, with a total of 5.4 million measurements. The workflow is automatically executed based on a Bash script. After finishing all the InSAR processing, a fusion operation is carried out to merge the deformation results for three frames, and the mean LOS velocity map for the whole area is displayed in Figure 10. In Figure 11, we can see very serious subsidence in most of the cities around this region, which has been extensively investigated by many researchers, and our results have shown similar subsidence patterns with these previous works [36][37][38][39]. Several subsidence centers can be identified from the mean velocity map. Among them, Beijing-Tianjin regions have a deformation rate more than 60 mm/year, including the Tongzhou, Langfang, and Tianjin district, and subsidence in this region has shown a trend of gradually growing together. To illustrate the InSAR deformation results in the Beijing-Tianjin region with the geodetic measurements [40], both results are shown in Figure 11. Figure 11. The distribution of deep and shallow ground water, the major subsidence region in Beijing-Tianjin, and the deformation results estimated by our method.
From Figure 11, the subsidence in Tongzhou and Chaoyang districts of Beijing and Langfang are located in the same areas with the geological investigations, and the main reason in this region is the expansion of cities and populations. The subsidence in the central urban areas of Tianjin is not significant, because strict groundwater protection measures have been enforced in recent years. In addition, the most serious region in Tianjin is located in Wuqing district, with a maximum subsidence more than 110 mm, while the subsidence in North Xiongxian County (38.9°N, 116.2°E) is closely related to the shallow groundwater funnel. Therefore, the deformation results estimated by our method agree well with the geodetic measurements.
Due to the overexploitation of underground water for agricultural and domestic usage, subsidence in Hebei province is relatively severe and complex. From Figure 10, we can see many strong subsidence bowls in central Hebei, and the subsidence rates in some areas exceed 120 mm/year. To validate the InSAR deformation results in this region, the groundwater-level measurements in Cangzhou and Hengshui urban areas are compared with InSAR displacements from Jan 2019 to July 2019 in Figure 12a,b respectively. Figure 11. The distribution of deep and shallow ground water, the major subsidence region in Beijing-Tianjin, and the deformation results estimated by our method.
From Figure 11, the subsidence in Tongzhou and Chaoyang districts of Beijing and Langfang are located in the same areas with the geological investigations, and the main reason in this region is the expansion of cities and populations. The subsidence in the central urban areas of Tianjin is not significant, because strict groundwater protection measures have been enforced in recent years. In addition, the most serious region in Tianjin is located in Wuqing district, with a maximum subsidence more than 110 mm, while the subsidence in North Xiongxian County (38.9 • N, 116.2 • E) is closely related to the shallow groundwater funnel. Therefore, the deformation results estimated by our method agree well with the geodetic measurements.
Due to the overexploitation of underground water for agricultural and domestic usage, subsidence in Hebei province is relatively severe and complex. From Figure 10, we can see many strong subsidence bowls in central Hebei, and the subsidence rates in some areas exceed 120 mm/year. To validate the InSAR deformation results in this region, the groundwater-level measurements in Cangzhou and Hengshui urban areas are compared with InSAR displacements from January 2019 to July 2019 in Figure 12a,b respectively.
Due to the overexploitation of underground water for agricultural and domestic usage, subsidence in Hebei province is relatively severe and complex. From Figure 10, we can see many strong subsidence bowls in central Hebei, and the subsidence rates in some areas exceed 120 mm/year. To validate the InSAR deformation results in this region, the groundwater-level measurements in Cangzhou and Hengshui urban areas are compared with InSAR displacements from Jan 2019 to July 2019 in Figure 12a From Figure 12, we can see that the groundwater levels have remained in decline from January to June 2019, and then rise in July due to the increase in rainfall. The InSAR displacements have also shown the same trend with the groundwater level changes in the two places, which has proven that our deformation results are reasonable.
The major subsidence areas of Hebei province are located in the small urban and rural areas, and we determine and analyze 12 strong individual subsidence bowls from Figure 10. To illustrate the spatial distribution of these subsidence bowls, a paraboloid interpolation is performed to generate a 3D model for the subsidence centers, as shown in Figure 13.
The widely distributed subsidence areas in Hebei province have posed a great threat to the safety of transportation and infrastructure; thus, the proper measures should continue to be taken by the local government to protect underground water resources. Moreover, based on the subsidence bowl distribution in Figure 13, we can make targeted protection policies by restricting and prohibiting groundwater exploitation in the most serious subsidence locations.
To better-validate the accuracy of the proposed method, we compare the estimated deformation results with GPS measurements. We exploit Continuous GPS coordinate time series from the China Earthquake Administration, and the GPS measurements are processed by GAMIT Software [41]. The locations of the three GNSS stations have been shown in Figure 8b with red dots, and we prepare the GPS daily measurements with the same time span as Sentinel-1 acquisitions. While two of the GPS measurements are available from 20180901 to 20200430, the other one only provides the measurements from 20180901 to 20190630. The GPS displacement time series have been transformed into the LOS direction, and both the GPS and InSAR time series are relative to the reference image at 20180912. The comparison between GPS measurements and InSAR results for three stations are shown in Figure 14a-c in red stars and black triangles, respectively. In addition, some of the InSAR results of the displacement time series are removed based on the residual phase variation due to the strong atmospheric effect, such as the images at 20190510 and 20190802.
shown the same trend with the groundwater level changes in the two places, which has proven that our deformation results are reasonable.
The major subsidence areas of Hebei province are located in the small urban and rural areas, and we determine and analyze 12 strong individual subsidence bowls from Figure 10. To illustrate the spatial distribution of these subsidence bowls, a paraboloid interpolation is performed to generate a 3D model for the subsidence centers, as shown in Figure 13. In Figure 14, we can see that the deformation results estimated by the proposed method have shown good agreement with GPS measurements, and the standard deviation between GNSS and InSAR time series is less than 0.8 cm. Thus, the comparison results have proven that the P-CSInSAR method can retrieve the deformation results for Sentinel-1 IW SAR images with very high precision.

Performance Analysis
The computation platform used in the experiment is an Intel Xeon Gold 6129 Processor with 64 cores, 256 GB memory, and 10 TB shared storage, accessible through a network file system with 1 GB/s bandwidth. The size of the input image data is approximately 0.9 TB, and the storage required for the total intermediate files and final results is more than 4 TB. The processing time for the entire process is approximately 41 h, with the three steps of the previous section accounting for 49%, 15%, and 36%, respectively. To further evaluate and quantify the performance of the parallel efficiency and the scalability of the proposed method, a speedup analysis is carried out for the different processing steps in the workflow with an increasing number of computing cores, which is displayed in Figure 15. The ideal achievable speedup (the ideal speedup value is equal to the number of cores) is shown in red squares, and the experimental speedup for the three steps is shown in blue squares in Figure 15a-c. Figure 15d shows the speedup for the overall processing chain.
GB/s bandwidth. The size of the input image data is approximately 0.9 TB, and the storage required for the total intermediate files and final results is more than 4 TB. The processing time for the entire process is approximately 41 h, with the three steps of the previous section accounting for 49%, 15%, and 36%, respectively. To further evaluate and quantify the performance of the parallel efficiency and the scalability of the proposed method, a speedup analysis is carried out for the different processing steps in the workflow with an increasing number of computing cores, which is displayed in Figure  15. The ideal achievable speedup (the ideal speedup value is equal to the number of cores) is shown in red squares, and the experimental speedup for the three steps is shown in blue squares in Figure  15a-c. Figure 15d shows the speedup for the overall processing chain. In Figure 15, we can see that the performance of the proposed method has enhanced greatly with increasing computing resources, but it has some gaps with the ideal linear speedup. For the Sentinel- In Figure 15, we can see that the performance of the proposed method has enhanced greatly with increasing computing resources, but it has some gaps with the ideal linear speedup. For the Sentinel-1 TOPS-mode image processing step in Figure 15a, the parallel computing efficiency is improved with an increasing number of cores based on burst-level parallelism. When the cores exceed the burst numbers, the algorithm efficiency shows very little improvement. Moreover, due to the large size of the Sentinel-1 image with the original resolution, the processing time has been slowed by intensive I/O operations. For the Sentinel-1 interferometry step in Figure 15b, the size of the multilook image is much smaller than that in the first step; thus, the algorithm efficiency can increase with a near-linear trend despite the massive I/O operations as the core numbers increased. In the time series analysis step, the block numbers in the data partition and the nonparallel computing parts (such as network integration) have greatly influenced the parallel efficiency, and we can see that the experimental speedup is increasing not as fast as the ideal speedup in Figure 15c. Finally, as seen in Figure 15d, the speedup for the overall processing chain demonstrates the efficiency and scalability of the proposed P-CSInSAR method.

Conclusions
Due to the increasing interest in large-scale InSAR applications in recent years, we present a novel parallel computing-based P-CSInSAR method to tackle the challenge of massive Sentinel-1 InSAR data processing.
In this work, we first introduce the conventional MT-InSAR algorithms, and then analyze their algorithm parallelizability for the whole processing chain. By adopting flexible parallelization strategies based on different data or algorithmic structures, P-CSInSAR has achieved high parallelization efficiency.
To validate the effectiveness and accuracy of the proposed method, P-CSInSAR has been tested on the real Sentinel-1 data over the North China Plain with three frames of Sentinel-1 images in the same track. Compared with the results over the second frame dataset by open-source software packages, the P-CSInSAR method can obtain the same deformation pattern as the other software, while it outperforms the latter on the processing time with 14 h relative to one week, which has reduced much of the time and human cost. Moreover, the results processed by P-CSInSAR for the three frames of Sentinel-1 data are in good agreement with the geodetic measurements, and the mean standard deviation of GNSS and InSAR time series are less than 0.8 cm. These experimental results demonstrate that P-CSInSAR has not only obtained deformation results with high accuracy, but also has high efficiency and scalability, which can provide practical and effective technical means for local governments in urban planning and disaster monitoring.
However, the proposed P-CSInSAR method cannot perform multi-task parallel processing, and has to finish different frames of Sentinel-1 data processing sequentially. Moreover, the parallelism of some algorithms is limited by data structures, and cannot achieve higher parallelization efficiency, such as the SAR image coregistration algorithm. Future work will migrate P-CSInSAR to cloud computing or supercomputing environments, and both multi-node and multi-thread programming will be employed to further improve the efficiency of large-scale Sentinel-1 InSAR processing. Some computationally intensive tasks in the P-CSInSAR method will be decomposed into subproblems and completed in different nodes by task scheduling. In addition, to better-exploit the computing resources of HPC, the processing steps of P-CSInSAR will be redesigned to reduce the I/O operation by eliminating the storage need for some intermediate results. This novel HPC-based InSAR big data processing platform will be applied in the routine monitoring widespread subsidence in the North China Plain and other national-scale geodynamic process investigations.