A Satellite Data Mining Approach Based on Self-Organized Maps for the Early Warning of Ground Settlements in Urban Areas

Featured Application: The proposed method is a neural-network-based tool for the early warning of ground settlement hazards in urban areas. By analyzing MT-InSAR data through unsupervised learning, the method can find precursors of similar time-evolving phenomena. The method can be applied under different warning criteria and for different hyperparameters of the monitoring system.

Abstract: Structural failure prevention is a crucial issue in civil engineering. The causes of structure or infrastructure collapse include phenomena that slowly deform the ground and could affect the stability of foundations, such as differential settlements, subsidence, groundwater changes, slope failure, or landslides. When large urban areas need to be monitored, such phenomena are hard to map by means of classical structural health monitoring methods because of the unaffordable quantity of in situ measurements these methods would entail. A very effective alternative is to exploit multitemporal interferometric synthetic aperture radar (MT-InSAR) displacement timeseries, which enable the monitoring of wide geographical areas on a weekly basis. Analyzing the enormous amount of data produced by MT-InSAR may help to assess the time evolution of phenomena, but it can barely highlight "anomalous" ground deformations in time to prevent likely structural failure. This paper proposes a method that analyzes InSAR data through an unsupervised learning paradigm with the purpose of detecting critical events at their early stage. On the basis of similarities among time sequences, the method finds precursors of anomalous ground settlement behaviors, whose correct interpretation should be left to specialist evaluation and in situ inspections.


Introduction
Structural integrity and collapse prevention are very important aspects of civil engineering. A crucial goal in this field is to mitigate the risk of foundation downfalls which may entail loss of stability and even sudden failures of structures and infrastructures [1][2][3]. Different causes may trigger the progressive collapse of a building, the downfall of a bridge pile, or sinkholes. Ground subsidence [4], debris flow, excavation and tunneling activities [5], groundwater changes, slope failure, problematic foundation soils [6], bad interaction between soil and foundation, landslide due to volcanic activity or to pipes that burst, and liquefaction of soil after strong ground motions [7] are just some instances of phenomena that may affect the stability of structures and infrastructures.
Assessing the causes of structural failures is generally a hard task, which requires a great amount of information (e.g., structural characteristics, soil properties, soil-structure interaction, rainfall, and climatic data) that is rarely available during the monitoring phase. Regardless of the causes and the likely solutions of a given critical phenomenon, however, a preliminary monitoring stage is essential to prevent catastrophic events. For this purpose, a large-scale continuous monitoring of time-evolving phenomena that might affect the stability of engineering facilities in urban areas would be very useful. Health monitoring of civil structures and infrastructures is typically conducted through in situ nondestructive techniques [8][9][10][11]. However, when very long structures or blocks of many buildings are to be monitored, classical methods may become overly expensive and difficult to apply in practice. In such cases, global positioning system (GPS) techniques can be profitably used [12,13], even combined with traditional local measurements [14] and remote sensing techniques [15].
Very effective remote sensing techniques are based on exploiting multitemporal interferometric synthetic aperture radar (MT-InSAR) displacement timeseries, which enable the monitoring of wide geographical areas on a weekly basis. In fact, they are widely used for hazard assessment of very large or inaccessible zones (due, for instance, to military conflicts or extreme climate conditions) [16,17]. MT-InSAR techniques can be applied in several fields, ranging from archaeological detection [18], glacier dynamic assessment [19], and environmental impact monitoring [20] to several civil engineering applications [21][22][23][24][25][26][27].
The potential of MT-InSAR techniques can be amplified by their combination with advanced methods of data analysis. The use of computational algorithms to analyze remote sensing data has been expanding for over half a century [28], since crop identification based on the Apollo satellite spectral data was produced in the 1970s [29]. More recently, neural-network-based algorithms were used to analyze geological data for different purposes, including detection of hydrocarbon resources, pollution, and oil spills [30], classification of complex natural ecosystems [31,32], creating hazard survey maps in earthquake-prone areas [33], and assessment of some properties of soils [34,35]. In the civil engineering field, neural-network-based methods are currently used, for instance, to predict blast-induced ground vibrations [36], to control the behavior of buildings subjected to ground motion [37], to improve the features extraction under ultrasonic tests [38], or to optimize the structural design [39].
Neural-network-based algorithms have also been applied to analyze InSAR data in urban or suburban areas [25,40,41,42]. The present paper aims to contribute toward an automated method for monitoring ground settlements in urban areas, with the purpose of highlighting hazardous events at their early stage.
The proposed method is based on the combination of MT-InSAR techniques, which allow the continuous collection of timeseries of surface deformations at a high-density grid of geographical points, and an artificial neural-network-based algorithm, which enables processing big data in real time, leading to an automated alert system. On the basis of similarities between timeseries, the method aims to find precursors of likely ongoing critical surface deformations at their early stage. Empirical and analytical studies, in fact, showed that precursors may typically help to forecast some phenomena such as high ground stress in tunnels [43] or landslides [44]. The method presented here is based on the preliminary choice of a warning criterion that leads to finding the precursors of similar phenomena. To test the ability of the method, two warning criteria are assumed in the paper and, accordingly, two monitoring strategies are applied to a dataset of MT-InSAR timeseries relevant to a large urban area. It should be noted, however, that the method can be applied under different warning criteria and monitoring strategies, with those considered in the present study being just two of the many conceivable. It should also be noted that the considered dataset is only used to show the ability of the method to generate early-stage warnings according to the assumed warning criterion and monitoring strategy. Any consideration about the geological phenomena detected while testing the efficacy of the method in the considered area is beyond the purpose of the present study.
Lastly, it should be mentioned that several early-warning systems have been developed in the last decades to forecast various kinds of hazards. In civil engineering, optimized structural health monitoring technologies and data processing techniques have been adopted for early warning in seismic-prone zones [45,46]. The method presented in this paper fits into this line of research, its novelty lying in the capability to forecast events related to early-stage ground settlements in urban areas by analyzing InSAR data through an unsupervised learning paradigm.

Description of the Method
The warning system proposed in this work is based on a model-free approach. According to this paradigm, forecasting a critical event does not require an analytical model of the soil behavior, but rather the identification of similarities between series of measurements acquired at different times and locations. Basically, it is assumed that a critical event is preceded by precursors, which may lead to detecting an ongoing event at its early stage. Leveraging a proper metric in the space of the considered behaviors, the model-free approach is able to recognize such precursors. This type of approach has two main advantages over the conventional model-based approach. The first one is that it also allows identifying phenomena never studied before. The second advantage is that the recorded data are handled regardless of their physical meaning, reducing the need for preprocessing raw data.
A brief description of the neural techniques used to identify similarities between data series is first given in Section 2.1, while a detailed description of the proposed approach is provided in Section 2.2.

Clustering of Data
Artificial neural networks (ANNs) are mathematical tools based on the so-called "connectionist paradigm" [47,48], which stores information by means of the connections of a structure rather than through a set of memory cells. ANNs are made of networks of elementary processors linked together by weighted connections. This is, in fact, the typical paradigm adopted by the human brain, where the knowledge is built by processing data through the synapse network rather than by simply storing information in neurons. Although much less complex than the human brain, ANNs derive their name just from the paradigm similarity with it. Among the brain processes that ANNs can imitate, the most appealing is that of "learning" from experience, i.e., building functional relations on the basis of sets of learning examples.
A wide variety of techniques may be adopted to train ANNs, all of them being based on two essential elements: a training set and a training rule. The training rule establishes how the weights of the synapses should be modified to adapt the network to the training set. Since this rule mostly does not lead explicitly to a complete learning, iterative algorithms are usually needed. In general, ANNs can be based on supervised [49] or on unsupervised learning [50,51]. In the supervised learning, the neural network is exploited to reproduce a functional connection between input and output variables, the values of which are known for all the learning examples. A target value must be set for each learning example, which is taken as the expected output. Because the discrepancy between calculated and expected output values depends on the connection weights, the learning strategy tries to modify such weights in order to minimize the error. Should the learning process be successfully performed, the network can generalize the acquired information, being able even to associate outputs to inputs not belonging to the training set. The supervised learning has been the most adopted, and it is responsible for the widespread success that ANNs had in a lot of scientific fields.
However, neither target values nor strategies to minimize the errors are adopted by the human brain, the learning of which is in fact unsupervised. Inspired by the brain's learning strategy, the ANNs originally used unsupervised approaches, where the weights of the synapses continuously change in response to both external stresses (electrical impulses in the brain) and neuron replies. According to the information given by the inputs, the unsupervised learning can handle properties of a set of patterns and map its distribution in the input space. Analogies between different patterns can also be put in evidence, thus leading to a subdivision of a population into aggregates (clusters). The connection weights that are linked to the same neuron represent a prototype of the cluster, generally not coinciding with any example of the training set, even if it would have all the properties to be one of them. The learning process tries to make the prototype as similar as possible to the network input. One of the main activities of unsupervised ANNs is the clustering, which is usually based on a competitive learning (self-organized map, SOM) [51], where the different prototypes compete to win the learning step, i.e., the possibility to change their weights.
The best-known rule is "the winner takes all" [52], which results in an iterative process that changes the winning prototype to make it as similar as possible to the input. Thus, the distribution can be divided into clusters, each cluster being defined by a prototype which is the winner for the points belonging to that cluster. At the end of the process, each prototype tends to converge to the barycenter of the relevant cluster. Once the positions of the prototypes are stabilized, i.e., once the weights have converged to a stable value, they are frozen, and the training phase is concluded. At this point, the network can be used to perform a classification of new patterns, by associating them to one of the identified clusters. The recall of the network is performed by feeding the patterns to be analyzed to the trained network.
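The competitive update described above can be illustrated with a minimal Python sketch. It is not the implementation used in this work: the learning rate, the number of epochs, the initialization on randomly chosen patterns, and the function names `train_winner_takes_all` and `recall` are all assumptions made for illustration, and the neighborhood function of a full SOM is deliberately omitted.

```python
import numpy as np


def train_winner_takes_all(patterns, n_prototypes, epochs=20, lr=0.1, seed=0):
    """Competitive learning: for each input, only the closest prototype moves."""
    rng = np.random.default_rng(seed)
    # Assumption: prototypes start on randomly chosen training patterns.
    idx = rng.choice(len(patterns), size=n_prototypes, replace=False)
    prototypes = patterns[idx].astype(float).copy()
    for _ in range(epochs):
        for x in patterns[rng.permutation(len(patterns))]:
            # The winner is the prototype nearest to the current input.
            winner = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            # Move the winner a fraction lr of the way toward the input.
            prototypes[winner] += lr * (x - prototypes[winner])
    return prototypes


def recall(patterns, prototypes):
    """Assign each pattern to the cluster of its nearest (frozen) prototype."""
    dists = np.linalg.norm(patterns[:, None, :] - prototypes[None, :, :], axis=2)
    return np.argmin(dists, axis=1)
```

With a constant learning rate the prototypes keep fluctuating slightly around the cluster barycenters; a decaying rate is a common way to let the weights settle before they are frozen.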
It is worth noting that unsupervised learning can highlight properties of the distribution that are difficult to be estimated a priori, especially when the elements are defined by many variables. In the present study, the clustering process is exploited to find analogies between ground displacement behaviors occurring in different geographical locations and temporal occurrences. In particular, the analogy between behaviors preceding past anomalous events and similar actual behaviors can be exploited to generate alerts during the monitoring phase, calling for further in situ investigations.

Development of the Warning System
The proposed method entails the analysis of MT-InSAR timeseries to reveal analogies among reference behaviors that are assumed to be precursors of critical events, at different times and geographical positions. To this end, the capability of self-organizing maps to cluster the distribution of patterns on the basis of their similarity is exploited. In this method, the patterns are MT-InSAR displacement timeseries of preset duration. The historical data of the examined area are mapped by the self-organized map (SOM), and the clusters containing patterns that precede critical events are labeled as warning clusters. During the monitoring phase, the current timeseries acquired in the whole area are mapped by the trained SOM, and those falling within the warning clusters are selected for further investigation by means of in situ inspections. It should be noted that the method assumes that similar critical events have similarly featured precursors. Although reasonable and generally confirmed by the literature [43,44], this assumption is of course an empirical one, which may also lead to false warnings.
The criticalness of an event could be assessed on the basis of a feature of the timeseries (velocity, range, maximum/minimum value, etc.) or by referring to a reference record (an event that occurred in the past). The assessment based on a feature of the timeseries (feature-based assessment) can be applied automatically to an available database of sequences, without the need for reference records. The assessment based on a reference record (record-based assessment) carries out the monitoring of the area referring to a past event, requiring a preliminary analysis of the historical data by an expert.
Regardless of which criterion is adopted for stating the criticalness of an event, the observation time adopted in the analysis is crucial for the suitability of the method. In general, the phenomena involving soil movements may evolve differently in time. Too short an observation time makes it difficult to detect precursors of warning trends, since the phenomena of interest may not yet have completely developed; on the other hand, too long an observation time may mask the warning trend. Therefore, the best approach is to run several monitoring systems with different observation times in parallel. The choice of the most suitable set of observation times is left to the manager of the monitoring system.
The flowchart diagram of the warning system is depicted in Figure 1. An observation time is assumed to be initially preset. The diagram describes the procedures of both training (synthesis) and recall (monitoring). In the synthesis step, an SOM neural network is trained to cluster the timeseries on the basis of their similitude. In the monitoring step, the timeseries are fed to the trained SOM, which assigns each one to the corresponding cluster. An alarm is raised when a timeseries is assigned to a warning cluster. The synthesis and monitoring procedures are described below in detail.
Appl. Sci. 2022, 12, x 5 of 17

Synthesis
At first, the user decides whether to feed the network the entire database or a selected subset of sequences. Often, the whole database is very large, which makes it convenient to train on several subsets with a growing number of elements, until the last added sequences do not significantly modify the resulting SOM. The final subset is used as the training set. The training is performed by means of a hierarchical clustering procedure (Figure 2). At the root of the procedure, an SOM is trained on the whole training set, which is split into a set of clusters. The coordinates of the barycenter of each cluster are the connection weights linked to the corresponding output. The vector of such weights is referred to as a prototype. The training process moves the prototypes toward the barycenters of the different clusters. In general, the patterns belonging to different clusters are differently scattered (see Figure 3). The maximum difference between the patterns and the prototype is assumed as a measure of the scattering of a cluster.

The aim of the hierarchical clustering procedure (see Figure 2) is to split the clusters until their scattering is lower than a preset threshold. To this purpose, clustering is preliminarily performed on the entire training set, and the obtained clusters are subdivided into three subsets, as follows:
• Clusters A: the scattering is lower than the threshold;
• Clusters B: the scattering is larger than the threshold;
• Clusters C: empty.
Clusters A and their related prototypes are stored for successive assembly in the final network (assembling). Each Cluster B is reprocessed separately, giving rise, in general, to A, B, and C new clusters. Clusters C are eliminated. The procedure ends when no more Clusters B are found, such that only Clusters A remain. The prototypes of Clusters A are collected into a final unique SOM, which is the core of the monitoring system.
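The splitting procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: each SOM training run is replaced by a plain winner-takes-all clustering, and the names `competitive_clustering` and `hierarchical_clustering`, the number of prototypes per run `k`, and the `max_depth` safeguard are hypothetical choices not taken from this work.

```python
import numpy as np


def competitive_clustering(patterns, k, rng, epochs=20, lr=0.1):
    """Stand-in for one SOM training run (winner-takes-all updates only)."""
    k = min(k, len(patterns))
    idx = rng.choice(len(patterns), size=k, replace=False)
    prototypes = patterns[idx].astype(float).copy()
    for _ in range(epochs):
        for x in patterns[rng.permutation(len(patterns))]:
            w = np.argmin(np.linalg.norm(prototypes - x, axis=1))
            prototypes[w] += lr * (x - prototypes[w])
    dists = np.linalg.norm(patterns[:, None, :] - prototypes[None, :, :], axis=2)
    return prototypes, np.argmin(dists, axis=1)


def hierarchical_clustering(patterns, threshold, k=4, seed=0, max_depth=20):
    """Split clusters until their scattering does not exceed the threshold."""
    rng = np.random.default_rng(seed)
    final_prototypes = []
    pending = [(patterns, 0)]  # sets of patterns still to be (re)processed
    while pending:
        subset, depth = pending.pop()
        prototypes, labels = competitive_clustering(subset, k, rng)
        for j, proto in enumerate(prototypes):
            members = subset[labels == j]
            if len(members) == 0:
                continue  # Cluster C: empty, eliminated
            # Scattering: maximum distance between patterns and prototype.
            scattering = np.max(np.linalg.norm(members - proto, axis=1))
            if scattering <= threshold or depth >= max_depth:
                final_prototypes.append(proto)  # Cluster A: stored for assembly
            else:
                pending.append((members, depth + 1))  # Cluster B: reprocessed
    return np.array(final_prototypes)
```

The returned array plays the role of the assembled final network: recall assigns a pattern to its nearest prototype. The `max_depth` guard is an assumption added only to guarantee termination of the sketch.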
A preliminary choice of a warning criterion is required before entering the monitoring phase. The present method may be applied regardless of the warning criteria assumed. To assess the effectiveness of the method in the present work, two warning criteria are considered and used in the application example of Section 3. The first one is based on the incidence of patterns within the clusters, with the low-incidence clusters being labeled warning clusters. In fact, the rare occurrence of a behavior can be assumed as an index of anomaly. The second criterion refers to a past noticeable event: the cluster to which that event belongs is labeled as a warning cluster. The monitoring strategy based on the rare behavior criterion is referred to as Strategy 1, while that based on an event classified as noticeable is referred to as Strategy 2.
The labeling procedure of the clusters is performed before the monitoring phase. The labels are stored in the database.
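As a hedged illustration of the two labeling criteria, the Python sketch below flags warning clusters from the cluster assignments of the historical patterns. The function names and the incidence cutoff `max_incidence` are hypothetical; the text does not specify how low an incidence must be for a cluster to be considered rare.

```python
import numpy as np


def label_rare_clusters(assignments, n_clusters, max_incidence):
    """Strategy 1: flag clusters whose share of patterns is at most max_incidence."""
    counts = np.bincount(assignments, minlength=n_clusters)
    incidence = counts / counts.sum()
    # Assumption: empty clusters are not flagged, since no behavior was observed.
    return (counts > 0) & (incidence <= max_incidence)


def label_reference_cluster(reference_assignment, n_clusters):
    """Strategy 2: flag the cluster that contains a past noticeable event."""
    warning = np.zeros(n_clusters, dtype=bool)
    warning[reference_assignment] = True
    return warning
```

Both functions return a Boolean vector with one entry per cluster, which is the kind of label table that can be stored alongside the trained network.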

Monitoring
The monitoring procedure retrieves the trained SOM network, the labels of the clusters, and the sequences acquired at the geographical points to be monitored. The SOM associates each sequence under test with one of the learned clusters, and, if the switched-on cluster is labeled as "warn", an alarm is generated.
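The recall step can be sketched in a few lines of Python. The function name `monitor` and the array layout (one row per pattern, one row per prototype, one Boolean label per cluster) are assumptions made for illustration.

```python
import numpy as np


def monitor(patterns, prototypes, warning_labels):
    """Return the winning cluster of each pattern and whether it raises an alarm."""
    dists = np.linalg.norm(patterns[:, None, :] - prototypes[None, :, :], axis=2)
    winners = np.argmin(dists, axis=1)  # cluster switched on by each pattern
    return winners, warning_labels[winners]
```

For datasets with millions of patterns and thousands of prototypes, the full distance matrix would not fit in memory, so the patterns would have to be processed in batches.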

Explanatory Application of the Method
As an example, the method described in the previous sections is applied to a database relevant to the urban area of Naples (Italy). A COSMO-SkyMed dataset [53] of MT-InSAR displacement timeseries (persistent scatterers [54]) relevant to 411,582 target geographical points, recorded from December 2008 to August 2014, was used as a work set. The dataset is a matrix of as many rows as the target geographical points, and 319 columns, collecting ID label, latitude, and longitude of each geographical point, as well as a sequence of 316 displacement samples. The raw data are processed to calculate the vertical component of the displacement by projecting the line of sight (LOS) [55]. Figure 4 shows the QGIS map of the whole set of sampling points. Some instances of displacement time sequences are shown in Figure 5 together with the satellite image highlighting the geographical points where the displacements were remotely recorded.
An application of the proposed method to the dataset of Naples is described below. Section 3.1 illustrates the preprocessing of data and the training phase. Sections 3.2 and 3.3 provide instances of the monitoring done through Strategy 1 (rare behavior) and Strategy 2 (noticeable event), respectively (see Section 2.2.1).
To test the ability of the method to detect the early stage of critical events on the basis of similarities among timeseries, the monitoring procedure was applied with reference to past instances of the considered database.

Data Preprocessing and Training
Preprocessing is needed to allow the network to handle the timeseries. In general, the monitoring process should be carried out by choosing time windows consistent with the time constants that characterize the considered phenomena. Since the duration of all recorded sequences is usually much longer than the time windows, many instances are typically obtained from each sequence. Furthermore, a suitable limited number of instances must be extracted from the significant sequences of the whole database. Sequences are assumed to be significant when behaviors typical of the investigated phenomena occur within them.
The preprocessing procedure was here applied through the following steps:
i. Since the MT-InSAR timeseries are sampled with variable time steps, a linear spline was first used to resample all the series with a 6-day time step, which is also roughly referred to as a "week" hereafter. Overall, the duration of each timeseries was 305 samples, corresponding to about 61 months.
ii. On the basis of a ranking criterion, a training subset was extracted from the whole database. The displacement rate along the sequence was used as the ranking criterion. Only the timeseries exceeding a preset level of displacement rate were selected to train the SOM network. In the present example, 50,000 timeseries constituted the training set. In Figure 6, examples of a high-ranking and a low-ranking series are shown.
iii. Time windows of preset duration were extracted from each timeseries. Considering L = 305 as the duration of a timeseries and N = 10 as the preset duration of the time window, (L − N + 1) = 296 time windows were extracted from each timeseries. In total, 7,400,000 time windows constituted the training set, while 121,827,976 time windows formed the whole dataset.
iv. A feature extraction was performed to reduce the sensitivity of the patterns to noise. The fast Fourier transform (FFT) was calculated for each time window, obtaining 10 complex frequencies and finally a pattern of 19 real components. Principal component analysis (PCA) was performed to reduce the size of the patterns. The final dimension of the patterns was 3.
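Steps i, iii, and iv can be sketched in Python with NumPy as follows. This is a minimal illustration, not the code used in this work. In particular, the way the 19 real components are assembled from the 10 complex FFT values (all real parts plus the imaginary parts of every bin except the zero-frequency one, whose imaginary part vanishes for real input) is an assumption, as are the function names and the SVD-based PCA.

```python
import numpy as np


def resample(times_days, displacements, step_days=6.0):
    """Step i: linear interpolation onto a regular grid of step_days."""
    grid = np.arange(times_days[0], times_days[-1] + 1e-9, step_days)
    return np.interp(grid, times_days, displacements)


def sliding_windows(series, n=10):
    """Step iii: all L - n + 1 overlapping windows of n samples."""
    return np.lib.stride_tricks.sliding_window_view(series, n)


def fft_features(windows):
    """Step iv: real-valued pattern of 2n - 1 components from the FFT."""
    spectrum = np.fft.fft(windows, axis=1)
    # Assumption: drop only the imaginary part of the zero-frequency bin.
    return np.hstack([spectrum.real, spectrum.imag[:, 1:]])


def pca_reduce(features, n_components=3):
    """Step iv: project the centered patterns on the leading principal axes."""
    centered = features - features.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T
```

For a series of L = 305 samples and N = 10, `sliding_windows` returns 296 windows, `fft_features` turns each of them into 19 real components, and `pca_reduce` yields patterns of dimension 3. In a monitoring setting the principal axes would be computed once on the training set and reused for new patterns, which this sketch does not show.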
An SOM was trained on the set of patterns by means of the hierarchical clustering procedure illustrated in Section 2.2.1. This procedure yielded a global SOM which could cluster the entire dataset of patterns. In total, we found 10,171 clusters. A histogram plotting the incidence of the occurrences in the clusters is shown in Figure 7.
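As a rough illustration of how cluster occupancy can be counted once a set of prototypes is available, the sketch below assigns each pattern to its nearest prototype and tallies the occurrences per cluster. The prototypes and patterns are toy values, the function names are hypothetical, and the hierarchical training procedure of Section 2.2.1 is not reproduced.

```python
import numpy as np

def assign_clusters(patterns, prototypes, chunk=1024):
    """Index of the nearest prototype for each pattern, computed in chunks."""
    proto_sq = (prototypes ** 2).sum(axis=1)
    labels = np.empty(len(patterns), dtype=np.int64)
    for start in range(0, len(patterns), chunk):
        block = patterns[start:start + chunk]
        # Squared Euclidean distance via |x|^2 - 2 x.p + |p|^2.
        d2 = (block ** 2).sum(axis=1)[:, None] - 2.0 * block @ prototypes.T + proto_sq[None, :]
        labels[start:start + chunk] = d2.argmin(axis=1)
    return labels

def cluster_occupancy(labels, n_clusters):
    """Number of patterns falling in each cluster."""
    return np.bincount(labels, minlength=n_clusters)

# Toy prototypes and patterns (illustrative values, not from the paper).
prototypes = np.array([[0.0, 0.0, 0.0], [1.0, 1.0, 1.0], [5.0, 5.0, 5.0]])
patterns = np.array([
    [0.1, 0.0, 0.0],
    [0.9, 1.0, 1.0],
    [1.1, 1.0, 1.0],
    [0.0, 0.0, 0.2],
    [4.8, 5.0, 5.0],
])

labels = assign_clusters(patterns, prototypes)
counts = cluster_occupancy(labels, len(prototypes))
rare_clusters = np.flatnonzero(counts <= 1)  # candidate clusters with few occurrences
```

A histogram such as the one in Figure 7 corresponds to plotting `counts`; clusters with very few occurrences are those listed in `rare_clusters`.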

Monitoring Strategy 1 (Rare Behavior)
The obtained SOM could then be applied to the whole database with the aim of detecting rare behaviors, according to the following steps:
i. Among the clusters obtained from the trained SOM, one containing few examples was chosen. Low occupancy was assumed as an index of anomalous behavior, and the patterns falling in this cluster were labeled as "warning". As an example, cluster no. 291, having only one occurrence, was set as the "reference cluster". Figure 8a plots the timeseries of the training-set point to which the window falling in reference cluster 291 (shown in red) belonged. Such a window was expected to precede an anomalous event.
ii. To test the method, we assumed the 280th week of 305 as the monitoring instant, allowing an investigation of how the series evolved in the following weeks. Monitoring considered the windows that closed at the monitoring instant and was applied to the geographical points of the whole dataset.
iii. The preprocessing applied to the training set (see point iv of Section 3.1) was also applied to the monitoring set, taking care to use the same parameters adopted in the training (average, ranges, normalization, principal components, etc.).
iv. The monitoring patterns were fed to the SOM, which assigned them to different clusters. The "warning" patterns falling into the reference cluster were singled out for a more in-depth study.
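The monitoring steps above can be sketched as follows. This is an assumption-laden toy example rather than the authors' code: the function names are hypothetical, the feature layout follows the same assumed 19-component construction (real FFT parts plus nonzero imaginary parts), the PCA parameters are placeholders, and the prototypes are taken directly from two toy points instead of a trained SOM. For a dataset with hundreds of thousands of points and thousands of clusters, the distance computation would need to be chunked.

```python
import numpy as np

def fft_features(windows):
    """Real FFT parts plus the imaginary parts of bins 1..n-1 (assumed layout)."""
    spec = np.fft.fft(windows, axis=-1)
    return np.concatenate([spec.real, spec.imag[..., 1:]], axis=-1)

def monitoring_patterns(series, instant, n, mean, components):
    """Patterns of the windows closing at the (1-based) monitoring instant.

    `mean` and `components` must be the parameters fitted on the training set.
    """
    windows = series[:, instant - n:instant]
    return (fft_features(windows) - mean) @ components.T

def nearest_cluster(patterns, prototypes):
    """Index of the closest prototype for each pattern."""
    d2 = ((patterns[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

def warning_points(series, instant, n, mean, components, prototypes, reference_cluster):
    """Indices of the points whose monitoring window falls in the reference cluster."""
    patterns = monitoring_patterns(series, instant, n, mean, components)
    return np.flatnonzero(nearest_cluster(patterns, prototypes) == reference_cluster)

# Toy data: points 0 and 2 start settling at week 271, point 1 stays flat.
series = np.zeros((3, 305))
ramp = -np.arange(35, dtype=float)
series[0, 270:] = ramp
series[2, 270:] = ramp

mean = np.zeros(19)            # placeholder training mean
components = np.eye(3, 19)     # placeholder projection onto 3 components
prototypes = monitoring_patterns(series[:2], 280, 10, mean, components)

warnings = warning_points(series, 280, 10, mean, components, prototypes, reference_cluster=0)
print(warnings)  # [0 2]
```

The key design point is that `mean` and `components` are passed in rather than refitted, which mirrors the requirement in step iii that monitoring data be preprocessed with the training parameters.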
By applying this monitoring strategy, we found only one warning over 411,581 points. The timeseries of this warning point is plotted in Figure 8b, where the red line indicates the monitoring window (which is assumed to precede the anomalous event). The points to which Figure 8a,b refers are shown in the map of Figure 8c, which has a purely illustrative purpose. It can be noted that Strategy 1 was able to highlight an anomalous phenomenon (a rather large vertical displacement which developed in a very short time) at its very early stage. It should be stressed that the method is simply aimed at warning against possible anomalous behaviors at their early stage, while a more accurate study of them should be entrusted to specialist assessments and in situ inspections.

Monitoring Strategy 2 (Noticeable Event)
Strategy 2 followed the same steps as Strategy 1, apart from the choice of the warning cluster. In this case, a recorded noticeable event was chosen as the reference behavior, and the cluster to which it belonged was labeled as "warning". As an example, the event shown in Figure 9, characterized by a comparatively large displacement in a short time, was taken as a noticeable behavior (the representative early-stage pattern is shown in red). The 216th week was assumed as the monitoring instant. In total, we found 11 warnings over 411,581 target points. Figure 10 displays the 11 warnings (black) and the noticeable behavior (blue). The monitoring windows are plotted in red, while the monitoring instant (216) is highlighted by a green vertical line. Figure 10 shows that all the points belonging to the warning cluster exhibited an anomalous behavior after the monitoring instant. This shows the ability of the proposed method to detect likely critical events at their early stage. The 11 warning points are shown in the map of Figure 11. It should be stressed again that a more in-depth study of the geological nature of the detected phenomena is beyond the purpose of the present study.
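The only difference from Strategy 1, namely how the reference cluster is chosen, can be sketched in a few lines. The example below is hypothetical: the prototypes, the event pattern, and the monitoring labels are toy values, and the function names are not taken from the paper.

```python
import numpy as np

def reference_cluster_from_event(event_pattern, prototypes):
    """Cluster of the early-stage pattern of a recorded noticeable event."""
    d2 = ((prototypes - event_pattern) ** 2).sum(axis=1)
    return int(d2.argmin())

def warnings_for_reference(monitoring_labels, reference_cluster):
    """Indices of the monitored points assigned to the reference cluster."""
    return np.flatnonzero(monitoring_labels == reference_cluster)

# Toy prototypes in the 3-component pattern space (illustrative values).
prototypes = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 3.0, 0.0]])
event_pattern = np.array([1.8, 0.1, 0.0])   # pattern of the noticeable event window

reference = reference_cluster_from_event(event_pattern, prototypes)

# Cluster labels of the monitoring windows, as produced by the trained map.
monitoring_labels = np.array([0, 1, 2, 1, 0])
warnings = warnings_for_reference(monitoring_labels, reference)
```

Here the event pattern is closest to the second prototype, so the points labeled with that cluster are the ones returned as warnings.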

Discussion
On the basis of the application examples provided in Section 3, some comments can be made on the proposed monitoring procedure.
i. Strategy 1 assumed that a rare behavior precedes a critical event, which is indeed a strong assumption. Figure 8 shows, however, that this strategy successfully found one warning point at the chosen monitoring time. It is interesting to note that, as in the reference case (Figure 8a), a large ground settlement also occurred after the precursor window in the warning point (Figure 8b). It can also be observed that the rarity of the reference behavior led the trained SOM network to find only one similar event, which is an index of the selectivity of the monitoring system.
ii. Strategy 2 assumes that analogous events have similar timeseries precursors. A reference window was chosen in this case on the basis of the anomalous trend of the timeseries that followed. This strategy highlighted 11 warning points, all of them experiencing a comparatively large displacement in a short time after the precursor window. Again, it should be noted that a limited number of warning points were detected by the monitoring system.
iii. The similarity between timeseries depends on the monitoring hyperparameters (e.g., length of the time window, scattering threshold). On the other hand, the choice of the hyperparameters can be calibrated according to the phenomenon to be monitored, provided that the main characteristics of the latter are known. It is in fact expected that different initial development times characterize different geological phenomena. In light of this, the monitoring system could even be run in parallel with different hyperparameters (e.g., the length of the time window) to forecast different geological events at their beginning.
iv. The early-warning procedure proposed here is flexible in its application under different monitoring strategies, as well as different hyperparameters.
v. In any case, in situ inspections are always needed to better check the situation in real time at the warning points under expert geotechnical guidance.
vi. Unsupervised learning of the artificial neural network may produce false or missed alarms. One of the purposes of the present monitoring procedure is to reduce both of these. To this end, the ability of the monitoring system can be improved through different actions (e.g., increasing the number of clusters, suitably choosing the training set, and setting the length of the time windows), by exploiting any useful information taken from geological and historical studies in the considered area. In fact, as the monitoring system improves its knowledge, the number of detected anomalous or uncoded behaviors will be reduced.
vii. Despite showing a good ability to detect, by similarity, anomalous surface movements at their early stage, the procedure was tested in this preliminary study without reference to more strictly geological analyses or in situ inspections. Geological features, which are likely to affect even the choice of the training set, cannot be ignored when applying the monitoring procedure to real cases.
viii. The effectiveness of the proposed monitoring procedure could be improved by (i) considering patterns of geographical points relevant to urban sub-areas either in the training or in the monitoring phase to focus on specific phenomena; (ii) exploiting data relevant to multiple geographical areas to better calibrate the monitoring system; (iii) running the procedure in parallel with different time window lengths and/or scattering thresholds, to forecast different kinds of events.
Lastly, some brief notes can be added on the advantages and limitations of the present monitoring system compared to other methods already available in the literature. Very powerful approaches based on cumulative displacement maps of InSAR data have in fact been successfully used to monitor ongoing phenomena, which are typically in their overt phase [21,22]. The present method aims instead to be an early-stage monitoring system, able to forecast the onset of potential ground instability phenomena.
On the other hand, the early-warning monitoring techniques available in the literature are generally aimed at predicting specific events for which an analytical or semi-empirical model is known. For example, the instant at which a landslide will occur was predicted in [56] through the inverse speed method, while the failure mechanism of a cataloged landslide event was studied in [57] to forecast the occurrence of similar events in different geographical areas according to some influencing factors. An attempt to predict the collapse of large infrastructures was also made in [28] using deformation outliers. These approaches were shown to be effective, albeit dependent on the assumed physical model and focused on monitoring a particular phenomenon. Unlike these examples, the method proposed here is independent of any preliminary theoretical model of the involved geological phenomena. It generates an alarm on the basis of the analogy of the monitored timeseries with a reference timeseries, which has already been classified or chosen for the rarity of the event. In practice, the monitoring can be broad-spectrum and not limited to a specific type of event, while the analysis of the detected potentially critical events is assumed to take place after the alarm generation. Nevertheless, the method can even be calibrated to forecast specific geological phenomena (onset of landslides, activation of faults, foundation settlements due to human underground activities or to the variation of groundwater levels) by duly setting the monitoring criteria and hyperparameters.
It is worth noting that the proposed method is still at a preliminary stage of development and its potential can be highlighted only after its application to real cases. In any case, even at an advanced stage of development and application, the method is not autonomous since it requires a post-monitoring phase with in situ inspections. The abovementioned techniques available in the literature can be usefully combined with the present method for a more comprehensive monitoring.

Conclusions
This paper presented a method to monitor ground surface displacements in urban areas with the purpose of forecasting anomalous behaviors from their early occurrence. Through training an SOM neural network, which highlights similarities among recorded timeseries, the method analyzes datasets of MT-InSAR timeseries recorded at hundreds of thousands of geographical points and provides warning signals according to a preset alert criterion. The method applies unsupervised learning to yield a hierarchical clustering of the training set. An SOM is thus obtained, representing the core of the monitoring system, which is able to highlight ground movements that precede the development of critical events. The method was tested on a dataset of InSAR timeseries relevant to a large Italian urban area. By way of example, two different monitoring strategies were applied, one based on rare behaviors and the other referring to a recorded noticeable event. The results obtained for both strategies show the capability of the method to highlight ground settlement phenomena at their early stage. This may lead to an automated system of structural health monitoring in urban areas.
It should be stressed that, although realistic, the timeseries dataset and the warning criteria adopted here have only an explanatory intent, while the practical application of the method is expected to be carried out with the advice of geotechnical experts. Different critical soil phenomena could, in fact, be highlighted by the present method, provided that the hyperparameters of the monitoring system are suitably set. An in-depth investigation of such phenomena is, however, outside the scope of the present study, the main aim of which was to provide a general-purpose tool for the early warning of the foundation-related safety risk of civil structures and infrastructures in urban areas.
Although model-independent and broad-spectrum, the present method is expected to be successful even in detecting the starting stage of specific geological phenomena, provided that it is duly calibrated. Calibration of the system and improvement of its selectivity (to avoid false or missed alarms) could be achieved through some strategic actions, for instance, referring to localized patterns of geographical points, considering multiple geographical areas to extract the training set, or running the monitoring system in parallel with different warning criteria and hyperparameters. These issues will be addressed in future lines of research.