Ground Subsidence Susceptibility (GSS) Mapping in Grosseto Plain (Tuscany, Italy) Based on Satellite InSAR Data Using Frequency Ratio and Fuzzy Logic

This study aimed at evaluating and mapping Ground Subsidence Susceptibility (GSS) in the Grosseto plain (Tuscany Region, Italy) by exploiting multi-temporal satellite InSAR data and by applying two parallel approaches; a bivariate statistical analysis (Frequency Ratio) and a mathematical probabilistic model (Fuzzy Logic operator). The Grosseto plain experienced subsidence and sinkholes due to natural causes in the past and it is still suffering slow-moving ground lowering. Five conditioning subsidence-related factors were selected and managed in a GIS environment through an overlay pixel-by-pixel analysis. Firstly, multi-temporal ground subsidence inventory maps were prepared in the study area by starting from two inventories referred to distinct temporal intervals (2003–2009 and 2014–2019) derived from Persistent Scatterers Interferometry (PSI) data of ENVISAT and SENTINEL-1 satellites. Then, the susceptibility modelling was performed through the Frequency Ratio (FR) and Fuzzy Logic (FL) approaches. These analyses led to slightly different scenarios which were compared and discussed. Results show that flat areas on alluvial and colluvial deposits with thick sedimentary cover (higher than 20 m) on the bedrock in the central and eastern sectors of the plain are the most susceptible to land subsidence. The obtained FRand FL-based GSS maps were finally validated with a ROC (Receiver Operating Characteristic) analysis, in order to estimate the overall performance of the models. The AUC (Area Under Curve) values of ROC analysis of the FR model were higher than the ones of FL model, suggesting that the former is a better and more appropriate predictor for subsidence susceptibility analysis in the study area. In conclusion, GSS maps provided a qualitative overview of the subsidence scenarios and may be helpful to predict and preliminarily identify high-risk areas for environmental local authorities and decision makers in charge of land use planning in the study area. Finally, the presented methodologies to derive GSS maps are easily reproducible and could also be applied and tested in other test sites worldwide, in order to check the modeling performance in different environmental settings.


Introduction
Ground subsidence is a geohazard that consists in a time-dependent lowering deformation of the land surface, which can be caused by various natural and anthropogenic factors [1][2][3]. Land subsidence that results from natural causes slowly proceeds due to sediment compaction under the pressure of overlying sediments or dissolution of carbonatic rocks, tectonic motions or isostatic adjustments [4][5][6]. Man-made activities can trigger or worsen subsidence processes through new settlement constructions evolution of land subsidence, but few works used them as input data for the inventory of subsiding areas in order to produce subsidence susceptibility maps [32,33].
Another important issue is the choice and management of the environmental conditioning factors related to ground subsidence: a geo-database in land subsidence modelling must cover various types of thematic information, such as geological and hydrological factors [26,29]. The use of GIS as the basic analysis software tool to integrate all the collected spatial data with varying spatial resolutions is nowadays essential. It allows for the collection and management of predisposing factors, presenting them as thematic layers suitable for spatial analysis. Moreover, GIS can handle these data for geospatial applications and zonal statistics computations required for susceptibility performance.
The study area of this work is the Grosseto plain located in the Tuscany region in Italy. The area is threatened by land subsidence, which has been recorded by space-borne InSAR data since 1992, through ERS and ENVISAT satellites, up to nowadays through the SENTINEL-1 constellation [34][35][36][37]. Furthermore, in January 1999, a huge circular sinkhole suddenly occurred in the plain in the Bottegone site and the causes of this phenomenon were found in the natural geological and structural setting of the area [38]. The deep limestone bedrock covered by unconsolidated alluvial or basin-fill deposits, as well as the presence of fault systems that act as preferential underground thermal water flow lines, make the area characterized by karst terrain features and thus susceptible to sinkholes and to natural ground subsidence [39].
In this work we exploit satellite SAR images, acquired by two sensors (ENVISAT and SENTINEL-1) in two different time spans (2003-2009 and 2014-2019) and processed through PSI technique, to provide multi-temporal subsidence inventory maps as the input data for susceptibility analysis of the area. Then, we investigate the performance of two geospatial model-based approaches, i.e., a bivariate statistical model, a Frequency Ratio (FR) method, and a mathematical probabilistic model-Fuzzy Logic (FL) operator -for the prediction of land subsidence due to natural causes in the area.
The FR method was chosen as a fast and precise statistical approach in terms of success rate with good predictive ability-especially in low occurrence areas [5,40]. It is a robust model that can be applied very simply [41]. The FL method was chosen as an expert knowledge-based approach that avoids linear monotonic relationship between susceptibility and causal factors assumed by bivariate statistical data-driven models [42]. In addition, in fuzzy operators the process is mathematically simple and understandable, despite the absence of a systematic and effective design method [5].
Thus, in this study we generated the so called Ground Subsidence Susceptibility (GSS) maps in the Grosseto plain by using past and recent ground motion data and by applying the two approaches mentioned above. We obtained slightly different scenarios which were compared, discussed and validated. The reliability and prediction accuracy of all the models were assessed by the Receiver Operating Characteristic (ROC). Furthermore, the GSS maps were cross-combined with the initial subsidence areas inventory. The specific objective of this work is to explore the two different susceptibility models of subsidence on the Grosseto plain for providing an effective qualitative overview of the subsidence scenarios, which could be helpful for identifying high-risk areas for environmental local authorities in charge of land use planning in the study area.

Study Area
The study area, which extended up to 503 km 2 , is located in the Grosseto coastal plain (southern Tuscany, Italy), which is one of the largest alluvial plains in Tuscany, where the Ombrone and Bruna rivers stream towards the Tyrrenian Sea (Figure 1a). The area is fairly flat and is characterized by a relatively small delta and a wide dune-belt [43]. The Grosseto plain corresponds to a tectonic depression in Triassic sedimentary rocks filled by a neoautoctonous thickness of terrigenous deposits [39].
The outcropping units mainly belong to the Paleozoic basement (sandstones/cemented gravels and quartzites of "Verrucano" Formation) of the Tuscan Metamorphic Units in the reliefs northward Grosseto and overlaying evaporatic and carbonate sediments (Triassic to Cretaceous age) of Tuscan units (i.e., carbonate breccias of "Cavernous Limestone" Formation). There are strips of Ligurian Units and Sub-Ligurian Units somewhere (i.e., Cretaceous "Argille a Palombini" Formation). Finally, the neo-autochthonous deposits and recent continental sediments close the stratigraphic sequence [44,45] ( Figure 1b).

Figure 1.
Overview of the study area: (a) location of the study area; (b) lithological setting of the study area, modified from [45]); (c) photo of the sinkhole event in Bottegone site in 1999, modified from [38]).
In particular, the rocks formations surrounding the plain mainly belong to the Tuscan Complex (Triassic-Oligocene) consisting in limestones, sandstones, quartzites, siltstones and cherts [46,47]. The floodplain loose sediments (Neogene-Quaternary) of the plain mainly consist of silts, clays with gravels and sands [44].
The sedimentary evolution of the Grosseto plain during the Quaternary has been described using stratigraphic information derived from wells and subsurface data collected in hydrogeologic studies [48,49]. During the Late Pleistocene, fluvial and aeolian sediments were deposited, which now crop out in low terraces along the borders of the alluvial plain [48,50]. The Ombrone and Bruna rivers cut two valleys into the Pleistocene deposits during the last glacial interval. The Holocene evolution of the alluvial plain was the result of the marine and river deposits due to the interaction between eustatic sea-level rise, tectonic movements and clastic discharge of the Ombrone River. Sediment filling mainly took place during historical times: in the Roman period, the north-western part of the plain was a lagoon and became a marsh during the Middle Age. The sands of the coastal dunes, whose innermost ridge is related to a tombolo which is most likely of Etruscan age, continue inland under the landfill for 2 to 3 km, and reach the thickness of 30 m [48].
In the northern part of the Grosseto alluvial plain a sinkhole characterized by sub-circular shape of about 140 m diameter occurred in January 1999 in an agricultural area called the Bottegone site ( Figure 1c). The maximum depth of the collapse was 15 m and the estimated involved volume was about 250,000 m 3 . According to studies [38,51], from a genetic standpoint, the sinkhole has a natural and deep origin due to the collapse of a karst void in the carbonate substratum [52,53]. The terrain fall was determined by the failure of the surface silty-clayey soil roof, due to an underlying cave generated through the dissolution of deep carbonate rocks.
Investigations on the Bottegone sinkhole highlighted that the main driven factors of the phenomenon were the high depth of the carbonate bedrock (belonging to the "Cavernous Limestone" formation) located at about 250 m underground and the tectonic asset of the area characterized by NNW-SSE-oriented fault systems that are dipping towards the Tyrrhenian Sea and move downward the bedrock. These faults would have also contributed to the dissolution that led to the karst cave creation by acting as preferential flow lines of thermal groundwater, which is found in the area due to the presence of some springs in the north-eastern portion of the plain. These springs are mainly located at the foot reliefs that border the plain or at the lithotype contact between permeable alluvial lithotypes and clayey sediments [38,51,54,55].
Concerning hydrogeological information about the study area, several wells are homogenously located over the whole field plain and they are active for groundwater withdrawals for irrigation purposes, especially during the summer period, while there are no wells with high water-flow rate for drinking purposes. Data from literature state that most part of the wells are shallow, with depth lower than 30 m and they do not cause relevant piezometric variations [51].

Methodology and Data
The workflow of the entire methodology employed in this work is visually represented in Figure 2. The first stage in the GSS mapping was the detection of ground subsidence area inventory from which the relevant factors were extracted as input data for the susceptibility analysis. We exploited Persistent Scatterers InSAR data acquired by ENVISAT and SENTINEL-1 sensors in two different time spans (2003-2009 and 2014-2019) to provide multi-temporal ground subsidence inventories in the study area. The polygonal areas of subsidence inventory were mapped to grid cells of 100 m and then randomly divided into a training set (70%) to be used to provide ground subsidence susceptibility maps and a validation set (30%) to be used to validate the predicted susceptibility maps.
According to existing literature, i.e., [17], and to the analysis on the occurred phenomena in the study area, i.e., [38], in this work we adopted five land subsidence conditioning factors that include lithology, land use, cover thickness, aquifer units and distance to faults. These conditioning factors (hereafter named as factors X) were classified in a suitable number of classes e and converted in cartographic layers in GIS environment as raster maps with 10 m cell size.
The management of factors X for GSS mapping was carried out throughout an overlay pixel-by-pixel analysis. To do it, the study area was converted in a 100 m grid cell size fishnet and we derived the raster value of each factor X for each pixel cell. The input value of each pixel was given to a point as pixel centroid and then transferred to the corresponding pixel of the fishnet. By intersecting the polygonal areas of each subsidence inventory, we obtained a fishnet subset which contained the subsiding pixel cells and that was randomly split into the training and validation set. Then, by using the training set of the above spatial database managed through pixel to pixel analysis, subsidence susceptibility assessments were performed through two separate approaches, i.e., Frequency Ratio (FR) and Fuzzy Logic (FL) models, which integrate the spatial relationship between ground subsidence locations and subsidence-related factors X. The FR involves the analysis of the ratio of the area where subsidence occurred to the total area for each class (e) of each factor X, while the FL uses the fuzzy set theory by applying membership functions (µ) to each factor X and by using the Gamma operator to combine them.
The outputs consisted in four GSS maps, respectively obtained by separately exploiting the past and the recent subsidence inventory (2003-2009 and 2014-2019) through the two different approaches (FR and FL).
The maps were finally validated with ROC analysis by assessing the success rate through the use of training dataset and the prediction rate curve using the validation dataset.

Ground Subsidence Area Inventory
Multi-temporal ground subsidence inventory maps in the study area were designed by means of space-borne InSAR data acquired by satellites in two different temporal spans.
We exploited InSAR PSI data derived from ENVISAT images in ascending and descending orbits in the spanning time 2003-2009 and from SENTINEL-1 images in ascending and descending orbits covering the period 2014-2019. ENVISAT data were processed with the PSInSAR technique [56] and obtained from the National Cartographic Portal of the project PST-A "Piano Straordinario di Telerilevamento" of the Italian Ministry for the Environment, Land and Sea [57], whereas Sentinel-1 data were processed through the SqueeSAR technique [58] and obtained from the web portal of the project "Monitoring ground deformation in Tuscany Region with satellite radar data" [36] funded by the Tuscany Region authority.
The PSInSAR technique is capable of measuring displacements with millimeter precision on radar targets retrieved within SAR images, the so-called Persistent Scatterers (PS), which can be anthropic (buildings, metallic structures) or natural objects (rock outcrops) characterized by a stable phase signal [56]. This technique turns out to be particularly suitable in urban/peri-urban areas which include many potential coherent PS such as buildings and road lines.
The SqueeSAR technique is an evolution of PSInSAR [58]. The SqueeSAR technique overcomes some limits of PSInSAR by extracting information not only from point-wise objects (i.e., PS), but also from Distributed Scatterers (DS). The latter ones are homogeneous areas detected over a group of pixels in the SAR image and characterized by moderate coherence. DS correspond to pasture, shrubs and bare soils that do not have the same high signal-to-noise ratios as PS, but they are still discernible from the background noise in SAR images elaboration. As a result, SqueeSAR algorithm, patented in 2011, jointly processes PS and DS, and make the density of radar targets higher-especially in rural and non-urban areas.
For SAR sensor onboard ENVISAT satellite, the look angle varies across the swath between 19 • and 27 • and can be assumed as constant value of about 23 • for the datasets on the Grosseto plain in both ascending and descending orbits. For SENTINEL-1 data acquired on Tuscany area, the look angle is about 36 and 40 • , respectively, in ascending and descending orbit. The main characteristics of the employed PSI datasets are reported in Table 1.
The velocities of ascending and descending geometries measured along the satellite Line Of Sight (LOS) of both sensors were merged because the look angles are similar and the motion components measured on ascending and descending directions are concordant; both being negative (moving away from the sensor). Thus, we assumed that the ground motions occurring over the plain and measured by the satellite along the LOS are almost purely vertical downward movements [34,36]. This choice allowed us to retrieve a higher amount of radar benchmarks in the study area, as the combination of ascending and descending data to derive east-west horizontal and vertical velocity components through vector-based methods existing in literature [4] would have led to a smaller number of PS points. Furthermore, in order to only consider vertical rates on the Grosseto plain, all the PS of opposite orbits were clipped on flat areas by removing radar targets located on slopes with a gradient higher than 5 • and then merged. In Figures 3a and 4a the spatial distributions of ENVISAT and SENTINEL-1 PSI mean annual velocities are shown. The green targets represent stable radar benchmarks that record LOS velocity within a threshold range around zero of ± 2 mm/yr; this stability threshold was chosen as fitting the standard deviation value (σ) of the PSI populations and already tested in literature for subsidence PSI-based analysis [4,10]. The maximum subsidence rate observed is 16   Since the PSI density revealed in both ENVISAT and SENTINEL-1 datasets is quite low (Table 1 and Figures 3a and 4a, respectively) and since the aim was to derive a polygonal inventory of most subsiding areas over the plain, a spatial interpolation of PS InSAR data was performed. In particular, we interpolated the point-wise velocity values throughout the Inverse Distance Weighted (IDW) spatial interpolator in ArcGIS software [59]. We used a quadratic weighting power of 2 within a variable search radius neighborhood and 12 as the number of points for calculating the interpolated velocity, and we obtained interpolated IDW velocity raster maps with 10 m cell-size. To create the subsidence areas inventories, we considered significant velocities higher than 4 mm/yr (in absolute value)-which is a value corresponding to 2σ-discarding lower values, finally contour lining the IDW velocity

Conditioning Factors
Five conditioning factors were chosen as mostly subsidence-related factors in the study area: lithology, land use, cover thickness, aquifer units and distance to tectonic lineaments. Cover thickness and distance to fault were continuous data, whereas lithology, land use and aquifer units were non-linear and categorical data ( Table 2). Each factor was classified in a suitable number of classes and converted in raster map layers with 10 m cell size.  The lithological data were extracted from 1:10,000 scale geological map distributed by the Tuscany Region authority [45]. Lithotypes were grouped in eight classes described above in Section 2.
The land use layer was derived from CORINE Land Cover map [60]. Soil types were distinguished in nine classes: grass and pasture; scrubs and herbaceous; urban fabric; semi-natural/natural areas; crops; bare soil and coastal dunes; water, swamp, river; vegetation; olive groves, vineyards and orchards.
The cover thickness layer was derived from the Water Protection and Management Plan of Tuscany Region authority [61]. The thickness of the sedimentary cover was classified in five classes (10 m <; 10-20 m; 20-35 m; 35-50 m; > 50 m) and two other classes were assigned respectively to 'no data' areas and to 'uncovered' areas. The thickest sedimentary covers (higher than 50 m) were found in the central sector of the plain on the left hydrographic side of the Bruna river and southwards on the hydrographic right side of the Ombrone river.
The aquifer unit layer was derived from the web portal of the Italian Institute for Environmental Protection and Research (ISPRA) [62]. A number of five classes of hydrogeological complexes were distinguished depending on overall primary and secondary permeability and interstitial porosity: "impermeable rock complex" in marls and clays; "crystalline rock complex" corresponding to aquifers in igneous and metamorphic rocks; "calcareous rock complex" corresponding to carbonate aquifers in mountainous areas; "sand complex" in sandstones and silicatic sediments; "basin fill alluvial complexes" corresponding to aquifers in the tertiary deposits located within the Grosseto floodplain.
The distance to the fault layer was derived from the spatial distribution of the main tectonic lineaments in the area, which is a considerable factor in the occurrence of land subsidence as testified by the Bottegone sinkhole event. Data on fault locations were taken from available literature [38] and from the Italian ITHACA (ITaly HAzard from CApable faults) database that collected all available information on active tectonic structures in Italy. By using the Euclidean distance in the ArcGIS Spatial Analyst tool and buffering from fault lines in kilometers, the output map was divided into five classes through the natural breaks classification [63]: 0-1; 1-2.5; 2.5-4; 4-6; and 6-10.

Data Overlay Analysis
In this step, we performed data preparation and storage of the subsidence-related conditioning factors by means of an overlay analysis in the GIS environment. The study area was gridded into a total number of 50,553 pixel cells with 100 m size. We appended values from each multi-class map layers of conditioning factors to each pixel cell of the fishnet. Throughout the intersection with subsidence area inventories, there were 2540 cells in the fishnet of ground subsidence areas in 2003-2009 and 3172 cells in 2014-2019. These subsiding cells of the two inventories were randomly partitioned into a 70% training set (1778 pixel cells in 2003-2009 and 2220 pixel cells in 2014-2019) and a 30% validation set. The training dataset was used for calibrating the subsidence susceptibility model, whereas the validation dataset was used to check the model performance as well as to confirm its accuracy (prediction power) [5].

Frequency Ratio (FR)
The FR model is a quantitative technique that defines the correlation between the phenomenon and the factors related to it in a spatial perspective; it consists in the computation of the ratio of occurrence to nonoccurrence probability for a specific phenomenon [41]. In this work, a Frequency Ratio (FR) model was used to examine subsidence susceptibility as it derives a quantified correlation between the subsidence area inventory and the conditioning factors. The relation analysis is the ratio of the area where subsidence occurred, with respect to the total area. In particular, FR was exploited by extracting the ratio of the percentage of subsidence occurrence pixels in each class e of conditioning factor X to the percentage of total pixels in that class. The relationship can be expressed as follows: where "FR X " is the frequency ratio of each factor X and represents the ratio of each causative factor influencing the subsidence occurrence; "Npixels subs in class e" is the number of pixels in class e containing subsidence area; "Npixels subs tot" is the total number of pixels where subsidence occurs in the whole study area; "Npixels in class e" is the number of pixels of class e in the whole study area; "Npixels tot" is the total number of pixels in the whole study area. A FR value which is higher than 1 means that the conditioning factor has a high correlation with subsidence; therefore, it indicates a greater spatial susceptibility due to stronger correlation. Otherwise, a lower value (FR lower than 1) indicates a weaker correlation with land subsidence occurrence and thus a lower susceptibility. In order to calculate the ground subsidence susceptibility, each individual conditioning factor was reclassified as based on the FR weights, and then all the factors were summed.
For our case study, the gridded study area was converted into five raster maps of conditioning factors X in order to obtain five layers with a fixed number of cells (5022272 cells). Once the FR values of classes e have been calculated, each raster map of predisposing factors was re-created on the basis of the FR values through the GIS Lookup tool. Finally, the GSS FR map was produced by the summation of each factor ratio value by Equation (2): where "FR X " is the reclassified conditioning factor X based on the FR weights computed with Equation (1).

Fuzzy Logic (FL)
The Fuzzy Logic (FL) approach is an indirect heuristic method based on fuzzy theory set, which relies on non-discrete mathematical formula used for handling several complex problems, due to its easy implementation [64,65].
Fuzzy logic considers the spatial objects on a map as members of a set and works with the concept of partial truth. In the classical set theory, an object is a member of a set if its membership value is 1, otherwise it is 0; so, the membership of a variable is defined on a binary scale as 1 = true or 0 = false. In a fuzzy set, instead, the membership can take any value between 0 and 1, as it is expressed on a continuous scale from 1 = full membership to 0 = full non-membership. The fuzzy set theory uses the idea of a membership function (MF) that expresses the degree of membership with respect to a variable [27].
The fuzzy set membership functions of the variables can be chosen according to knowledge-based weightage and following some defined curve functions. The membership function curve (hereafter indicated as µ) describes the degree of belonging of each data to each fuzzy set and takes a value between 0 and 1. The µ function curve represents the data values of the variable on the x-bar and the membership µ value on the y-bar, and can follow different function types. By simplifying, three basic curves commonly used to describe the type of fuzzy membership functions are the Bell-shaped curve, Z-shaped curve, and S-shaped curve. Then, according to the shape of the curve and to the positions of the inflection points, each type of curve can be sigmoidal, j-shaped or linear [63] (Figure 5). If the relationship between the value and the fuzzy membership does not follow any of the above functions, a user-defined fuzzy membership function can be used. For combining the fuzzy membership functions, several fuzzy operators (e.g., namely 'fuzzy AND', 'fuzzy OR', 'fuzzy algebraic PRODUCT', 'fuzzy algebraic SUM' and 'fuzzy GAMMA') can be used [65][66][67].
These operations allow integrating the fuzzy membership values of different variables by working with a cell-by-cell overlay. The 'fuzzy AND' overlay returns the minimum value of the sets the cell location belongs to. The 'fuzzy OR' overlay returns the maximum value of the sets the cell location belongs to. The 'fuzzy PRODUCT' overlay multiplies each of the fuzzy values, for each cell, for all the input variables. The 'fuzzy SUM' overlay adds the fuzzy values of each set the cell location belongs to. It is a complementary operator to the fuzzy algebraic product and the resulting sum is an increasing linear combination function. The 'fuzzy GAMMA" is an algebraic product of 'fuzzy PRODUCT' and 'fuzzy SUM', which are both raised to the power of γ (gamma) [67]. The general function is as follows: This operator is a compromise between the increasing tendencies of the fuzzy algebraic sum and the decreasing effect of the fuzzy algebraic product. The parameter γ is chosen between 0 and 1 and permits to control the decreasing or increasing tendency: when γ = 0 the combination equals the 'fuzzy algebraic product', when γ = 1 the combination equals the 'fuzzy algebraic sum'.
In ground susceptibility mapping, fuzzy logic defines the conditioning factors as members of a set by allowing different degrees of membership from 1, expressing the highest susceptibility, to 0, expressing no susceptibility of subsidence.
In our case study, the ground subsidence susceptibility map was evaluated on a cell-by-cell basis, by combining the extracted µ of the five predisposing factors through an inference technique (fuzzy inference) developed under fuzzy logic.
First, predisposing factor values for each pixel cell from the raster GIS layer with a fixed number of cells (5022272 cells) were derived.
Second, the relationship between susceptibility and each predisposing factor was assessed for each pixel cell, by applying the corresponding fuzzy membership curve to the value of the predisposing factor at the cell. To do so, the five subsidence-related conditioning factors were standardized from their original values to values in a range of 0 to 1 within µ function, by means of a fuzzification procedure performed with the Fuzzy Membership tool in GIS. The function curve represents the e class values of the predisposing factor on the x-bar and the membership µ value on the y-bar, and thus describes how ground susceptibility varies with regard to changes in a predisposing factor. Following the chosen functions, the five fuzzy index maps representing all the conditioning factors were generated.
Third, the overall GSS susceptibility map was computed for every pixel by aggregating the five fuzzified maps of the individual predisposing factors throughout the 'fuzzy GAMMA' operator with the Fuzzy Overlay command in GIS.
The fuzzy inference process for a given pixel cell can be represented using a general formula: where GSS ij is the susceptibility value at the cell (i, j); X is the predisposing factors, V is the aggregating overlay operator for calculating the overall susceptibility for the cell (i, j); µ x is the membership function describing the relationship between susceptibility and the predisposing factor X; e ij,x is the class value of the X-th predisposing factor at location (i, j).

Validation
Validation is the final step of modelling susceptibility maps for testing the effectiveness, the precision and the worth of the used methodology.
In this work, the validation was mainly carried out by means of the ROC analysis, firstly through the generation of the prediction rate and success rate curves. The ROC curves are two-dimensional graphs that are useful for organizing classifiers and visualizing their performance [68]. The prediction rate curve was created by using the subsidence areas of the validation data set (30% of the total pixels of the whole initial subsidence area inventory) that was not employed for the susceptibility map generation. It consisted in plotting the cumulative percentages of the subsidence areas of the validation data on the y-bar and the cumulative percentages of subsidence susceptibility areas arranged from the highest to the lowest value on the x-bar. The success rate curve was created in the same way, but, using the subsidence areas of the training data set instead. Since the success rate method used the training pixels that have already been used for building the susceptibility map, the success rate is not suitable for assessing the prediction capability of the models, but it permits to verify how well the resulting GSS maps have classified the areas of existing subsidence.
Secondly, the area under the ROC curves (abbreviated AUC, [68]) of both success and prediction rate curves was computed. The AUC value is normally used for evaluating the precision of the susceptibility method throughout an overall measure of predictiveness and it ranges from 0.5-represented by the diagonal line, meaning an inaccurate random model-up to 1.0, representing an ideal perfectly fitting model [69].
Thirdly, in order to finally evaluate the effectiveness of the adopted models, the produced GSS susceptibility maps were also cross-compared with the distribution of all the pixels of existing subsidence areas of the two inventories mapped in 2003-2009 and 2014-2019 in the study area. For this purpose, the two GSS maps were separately overlaid with the two initial subsidence inventory maps, and the ground subsidence occurrence percentages in all susceptibility classes for both maps were determined.

Results
By exploiting the spatial database and the methodologies described above, the assessment and mapping of GSS referred to the periods 2003-2009 and 2014-2019 were carried out through the use of both past and recent subsiding areas inventories within both frequency ratio and fuzzy logic analysis.

GSS FR Mapping (Analysis by FR)
Using the formulas of Equations (1) and (2), the FR value for each constituent class e of the five conditional factors X was computed. The computations of FR values are included in Table A1.
The frequency ratio indicates the spatial relation between the subsidence area and each of the five conditioning factors and reveals how this correlation is strong. For example, the lithology factor is classified in eight classes, among which alluvial deposits (class 1) comprise 34,127 pixels, corresponding to 67% of the total pixels number across the entire study area (50,553 pixels). In class 1, an amount of 1724 pixels corresponding to a percentage of 97% are located within the subsidence area inventory detected with ENVISAT data in 2003-2009 (1778 total subsiding pixels). The derived FR value for class 1 is 1.4, which is the highest value within this class and means a high relation of this lithotype with the subsidence occurrence area.
The land use factor shows the crops and bare soils/quarries as the most conditioning classes for subsidence in the study area, revealing FR values of 1.5 and 2.2, respectively, while the other land classes have a low impact on the ground subsidence distribution.
For the cover thickness, the occurrences of subsidence increase with the increase of thickness of sedimentary layers over the bedrock: the FR values vary between 3.7 and 5.6 for thick sedimentary coverages higher than 20 m, more likely to be subsiding than thinner coverages. Class 7, corresponding to cover thickness >50 m, achieved the maximum weight of 5.6 FR value.
Among the aquifer units, the basin fill alluvial complexes have the highest FR value (1.3) while the other hydrogeological complexes have the lowest association (0.0) with the subsiding pixels.
The distance to faults must have a significant influence on the spatial distribution of natural ground subsidence in the study area, as stated in literature, due to thermal water circulation in karst lithotypes [38]. Table A2 shows that areas with short distance (<1 km, class 1) and high distance (<5.8 km, class 5) from the fault show a lower FR scores with respect to the central classes: areas within 2.4-4 km from the fault (class 3) revealed to be the most prone for the land subsidence with a frequency ratio value of 1.8.
The resulting GSS FR map was classified and grouped by using the equal area approach into five classes for visual interpretation [27]: 'very high', 'high', 'moderate', 'low' and 'very low'. The map (Figure 6a) shows that the zones of 'very high' and 'high' susceptibility (red and orange colour classes, respectively) are located over the central and South-Eastern part of the plain. Flat areas on loose material of quaternary alluvial and colluvial deposits with sedimentary thickness over the bedrock higher than 20 m are the most prone to ground subsidence. 'Moderate' susceptibility values (yellow colour) are located around these areas throughout the flood plain with no sedimentary cover or lower thickness (10-20 m); 'low' susceptibility values (green) are located along the coast and in areas with high relief characterized by outcropping aeolian coastal sandy and silty deposits or conglomeratic rocks. 'Very low' susceptibility (dark green) class is located on vegetated relief areas where crystalline rocks crop out.
This general pattern is largely due to the spatial pattern of the conditioning factors-particularly lithology and sedimentary cover thickness-which are recognized as two important predisposing factors for natural ground subsidence.

GSS FL Mapping (Analysis by FL)
Based on expert opinion and statistical features, different fuzzy memberships have been assigned for each of the five conditioning factors of the subsidence in the study area.
In particular, the fuzzy membership values were chosen according to subjective judgment and expert knowledge and we chose the µ function, which was the best fit for the values distribution, according to the spatial extent of the subsidence pixels. The different choices of µ are shown in Table A1. These choices were taken on the basis of the data type of the predisposing factor, the spatial distribution of subsidence pixels and on knowledge-based expert evaluation, as explained below: • The membership function of lithology defined the first two classes (alluvial and colluvial deposits) as the most susceptible units and it was applied as a user-defined function, since any defined curve could be assumed. The fuzzy membership values were assigned in the range of 0.8, meaning the highest susceptibility to class 1 and 2 (extreme value as 1.0 in the assignation of fuzzy values has been left out) and 0.0, meaning that a pixel is at the lowest susceptibility.

•
The membership function assigned to land use factor followed a bell-shaped Gaussian curve with a linear shape due to the categorical-type of data with nominal classes. The highest fuzzy membership value of 1 (midpoint of the function) was assigned to class 5 (cropland) since 98% of the subsidence pixels in the study area falls into this land use class. The maximum spread value of 1 was chosen, since it results in a steeper distribution around the midpoint.
• The membership function of the cover thickness factor was a S-shaped curve, since it followed a sinusoidal directly proportional function, meaning the greater the value of the class factor, the greater susceptibility. The shape was chosen as sigmoidal since the factor is continuous-type grid data, with larger class values (class 5-7) having a membership closer to 1, midpoint of the range of values as 4 and the spread as 5 corresponding to the steepest distribution from the midpoint.

•
The membership function of the aquifer unit factor followed an user-defined function where the highest value (0.8) was assigned to class 5, as more than 98% of the subsidence pixels in the study area falls into this class, and the lowest value (0.0) was assigned to the other classes, as no subsiding pixels were recognized within them.

•
The membership function of the distance to fault factor followed a bell-shaped sigmoidal curve because the central classes showed the highest amount of subsiding pixels and the data type is continuous. The midpoint of the function was centered at class 3, with spread value of 0.1, meaning a medium steepness of the curve.
After factor maps have been weighted and "fuzzified" according to these chosen membership functions, they were combined using the fuzzy gamma approach, resulting in the final GSS FL map; different gamma values were tested to reach a bigger reliability of the fuzzy gamma operator. The fuzzy gamma approach with the value 0.7 was finally used since it is an intermediate value among the gamma range values and seemed to be a good compromise between the increasing and the decreasing tendencies effects of the fuzzy gamma approach.
The GSS FL map was classified and grouped into five classes ('very high', 'high', 'moderate', 'low', 'very low') for visual interpretation as GSS FR , by using the equal area approach in GIS environment. The map in Figure 6b provides similar results compared to the GSS FR map. The areas of 'very high' and 'high' susceptibility are located over the central and south-eastern part as in GSS FR , but they have a lower spatial extent as red and orange colour class are less widely distributed over the plain. Overall, susceptibility zonation of GSS FL seems to be more spatially approximated then in GSS FR -especially in lower susceptibility classes. This is a consequence of the FL model, which is a knowledge-driven approach, compared to the data-driven approaches like FR.

Validation of GSS FR and GSS FL Maps 2003-2009
The obtained FR-and FL-based subsidence susceptibility maps were validated with a standard ROC analysis, in order to estimate the overall performance of the models in the study area.
The success rate curve and the prediction rate curve, based respectively on the training and validation datasets, were drawn and the area under curve (AUC) value was calculated for both GSS FR and GSS FL maps (Figure 6a,b).
Regarding the GSS FR map, the AUC value of the success rate was 0.93, indicating a significant goodness-of-fit and the AUC value of the prediction rate was 0.94, demonstrating a reasonably accurate predictive ability of the model. The ROC validation of GSS FL map reveals slightly lower AUC values, i.e., 0.92 and 0.93 for the success rate and prediction rate, respectively.
Then, the classified GSS susceptibility maps were matched with the whole existing subsidence areas inventory map. By using this approach, amount and percentages of subsidence occurrence pixels falling into the susceptibility classes of each map were determined. The results of this analysis are presented as pie charts in the right portion of Figure 6a,b. In these graphs, it is clearly seen that no subsidence pixel, or very few ones, fall into the 'very low' susceptibility class in both the GSS FR and GSS FL maps. In the GSS FR map, 71% of the pixels are correctly classified as pixels in subsidence ("pixels subs" in Figure 6a,b) in the class 'very high' of susceptibility. The percentages of 24%, 4% and 1% of pixels in subsidence fall respectively into the 'low', 'moderate' and 'high' susceptibility classes in the GSS FR map; whereas, the percentages of 68%, 20%, 12% and 2% fall into the 'very high', 'high', 'moderate' and 'low' susceptibility classes in the GSS FL map.
In this regard, it is easy to say that most part of subsidence areas in the study area are compatible with the susceptibility classes in the GSS maps and that the two models provide similar results in this verification analysis.  Table A2.

GSS
Results are similar to GSS FR mapping referred to in the 2003-2009 period. Concerning the lithology factor, as expected, the maximum frequency ratio value (1.4) has been noticed for the alluvial deposits, followed by colluvial/landfill deposits (0.6), and by sandy/silty deposits (0.1). Regarding the land use, the subsidence spatial extent is wider in areas classified as "bare soils, dunes, quarries" and in croplands, resulting in FR values of 2.7 and 1.5, respectively. With respect to the relationship between cover thickness and subsidence probability, the FR values increase where cover thickness increases, and reach up values of 3.8 and 3.9 for class 6, corresponding to a sedimentary coverage with a thickness of 25 to 50 m and class 7 corresponding to a thickness higher than 50 m, respectively. Concerning the aquifers, the highest FR values have been assigned to unconsolidated clastic complexes (class 4 and 5) that lead to a higher water storage and volumetric compressibility and thus higher potential subsidence. The FR computations for the classes of the distance to fault factor reveal a pattern similar to that computed for GSS map 2003-2009 and included in Table A1 (maximum value assigned to class 5)-but they show slightly lower values.
The resulting GSS FR map was classified by equal area into five classes of susceptibility ('very high', 'high', 'moderate', 'low', 'very low'), in agreement with previously generated GSS maps.
The map (Figure 7a) presents similar results comparable to the GSS map referred to the past interval. The 'very low' and 'low' classes of susceptibility together comprise 47% of the total study area, the 'moderate' and 'high' classes comprise 17% and 20%, respectively, while the 'very high' class covers 16% of the whole area. In particular, the south-western part of the Grosseto plain turns out to be less susceptible in 2013-2019 as the 'very high' susceptibility class is less extended up in this sector with respect to the spatial distribution of GSS map referred to in 2003-2009.  Table A2 and are almost the same to the ones for the previous GSS FL displayed in Table A2. The "fuzzified" factor maps have been combined using the fuzzy gamma operator with the value 0.7.
The subsidence susceptibility map GSS FL has been classified into five classes by equal areas method by analogy with the other GSS maps and shows some similarities with the map simulated with FR (Figure 7b). The 'very low' and 'low' susceptibility classes represent 35% and 7% of the total study area, respectively, and they correspond to the ridge and coastal areas. The 'moderate' class is less represented (22%) than in GSS FR ; 'very high' and 'high' susceptibility classes include the 19% and 17%, respectively, of the whole area and cover the central and south-eastern sectors of the plain.

Validation of GSS FR and GSS FL Maps 2014-2019
The validation of the FR-and FL-based subsidence susceptibility maps was performed with the ROC analysis. The AUC values of the GSS FR are 0.94 and 0.93 for the success-rate and the prediction-rate curves, respectively, which demonstrates the overall exactness and prediction accuracy of the susceptibility mapping and predictive modelling. Regarding the GSS FL , the AUC values from the success-rate and from the prediction-rate curves are 0.91 and 0.93, respectively, suggesting that the used model is a fair indictor of subsidence mapping and a good prediction tool.
The amounts and percentages of total pixels in subsidence within each susceptibility class of GSS FR and GSS FL were also computed. The two verifications show very similar results with 57% or 58% of the pixels, respectively, classified in the 'very high' class of susceptibility and 31% in the "high" class (Figure 7a,b).

Discussion
The work consisted of four main steps: (i) the construction of multi-temporal subsidence area inventory maps; (ii) the collection and overlay of conditioning subsidence-related factors; (iii) the ground subsidence susceptibility (GSS) mapping through FR and FL models; and (iv) a validation step for the different produced GSS maps.
Ground subsidence areas must be identified and mapped accurately for preparing an inventory map in the study area as an essential starting point for a susceptibility analysis [16]. The use of remote sensing InSAR data for providing subsidence inventory map is a suitable source of data collection to be used as basis for subsequent models since it does not require fieldwork that may be very intensive, time-consuming and expensive over large areas, or even impossible in some areas with limited accessibility [42]. Moreover, the satellite InSAR technique allows measuring the range and rate of land subsidence with millimetric accuracy and frequent easy repeatability across time, which is not possible through any other traditional technique. In this work we spatially interpolate PSI data by means of IDW approach. This interpolation technique was chosen as it is a simple, flexible and quick deterministic method, based on spatial autocorrelation (closer values are more related than farther ones). It can be easily performed in GIS by setting suitable parameters: in our work, the number of known points used to calculate the interpolated velocity was set as 10 within a variable search radius neighborhood. A quadratic weighting power of 2 was set up, so that high velocity values were sharpened and not averaged out too much. Nonetheless the power number was not set higher because spatial velocity values varies quite gradually with not many localized peaks on the study area. A drawback of this kind of weighted distance interpolation is the bull's-eye effect, which is an artifact coming from isolated point data. The usage of a more advanced interpolation method, such as Kriging, would be a way out on this facet, but it would require more assumptions on statistical properties of the input data.
The whole work was implemented in a GIS environment. In particular, ArcGIS software was used for database construction, coordinate conversion, grid production, overlay analysis as well as for susceptibility modeling.
The choice of the conditioning factors for the analysis was based on information and data over the area derived from literature, in particular from studies on specific events, e.g., the Bottegone sinkhole which occurred in 1999. Natural causes for the observed subsidence in the area were assumed, and the five most relevant conditioning factors were taken into account: lithology, land use, cover thickness over the bedrock, aquifer complex type and distance from faults.
The overlay analysis of the conditioning factors was performed on a fishnet of 100 m cell size. This spatial resolution seems suitable because this choice was done for faster computation reasons, as well as for the type of investigated phenomenon, since land subsidence typically covers relatively large areas.
The FR and FL models were used in parallel and compared to generate GSS map in the study area. The FR is a bivariate statistical model that helps model the study area physically and is widely used due to its ease, relatively high precision, and compatibility with GIS [70]. In FR, the weighting values are estimated according to the distribution of land subsidence events in the inventory map and their relation to the class of each conditioning factor. In this way, the FR model provides an easier and robust way to ascertain the importance of each class of each causative factor on the occurring event.
The FL can be seen as a new way to express probability and is specifically designed to deal with imprecision because fuzzy represents membership in vaguely defined sets, and not the likelihood or condition [71]. Hence, the values to show the membership degree of the classes of conditioning factors are chosen on a semi-quantitative fuzzy weighting model. Basically, the approach is based on expert opinion on the environmental setting of the study area. An advantage of the fuzzy logic model is that the model allows more flexible combinations of the weighted maps and can be used at any scale of analysis [41,71]. The fuzzy sets can be edited very fast, so it is very easy to try various scenarios, to add or remove new data layers to the model with GIS software and to minimize the effects of over-estimated membership values. Differently from FR, the basic concept of the FL model is that the relationships between predisposing factors and susceptibility for a certain study area are obtained from subjective judgment, and then these relationships are applied to an evaluation of the susceptibility at each location in the study area [42]. This means that all the classes are judged by subject opinion, that is based on expert-knowledge of the area, but potentially be error prone as well [67]. Overall, this distinguishes the expert knowledge-driven approaches such as FL from the statistical data-driven approaches such as FR for susceptibility mapping. With the former, the relationships are approximated logically by the expert-knowledge of domain area, while with the latter, the relationships are statistically approximated-based on past recorded occurrences [42]. Once the GSS map is generated, various approaches for susceptibility map classification exist and are available in GIS software such as natural breaks, equal interval, equal area, geometrical interval, quantile, and standard deviation. They can be used on the basis of specific area conditions and distribution of the data population. For example, the natural break is a classification scheme that can be used when there is a wide jump in data values [5,72]. Equal interval, equal area or standard deviation classification methods highlight one class relative to others and are proper approaches when the data are close to normal distribution. The quantile classification takes into consideration widely different values into the same class and it is more applicable for positive or negative skewness of data [73]. Accordingly, in this study, different classifications were evaluated and tested to produce the susceptibility map visualization with high conformity with the actual environmental conditions, and finally the equal area method was selected as the best result providing the lowest degree of exaggeration. The same approach was already successfully used in other similar works, e.g., [27]. Moreover, the equal area approach, in which each category represents a similar proportional area, was chosen because of the intent to examine the difference in subsidence area density among the different classes based on susceptibility during the validation phase.
Overall, the past and recent GSS maps, respectively referring to 2003-2009 and 2014-2019, show similar results (Figures 6 and 7). In both periods, the areas most prone to subsidence classified as 'very high' and 'high' susceptibility are found to be located in the central and south-eastern part of the Grosseto plain. These zones correspond to flat areas on loose alluvial and colluvial deposits with sedimentary thickness over the bedrock higher than 20 m and are characterized by an aquifer in compressible unconsolidated clastic complexes. 'Moderate' susceptibility values border the rest of the plain with lower thickness of sedimentary sequence. 'Low' and 'very low' susceptibility values are located over the relief areas where crystalline rocks outcrop.
This general pattern is mainly due to the spatial distribution of some of the predisposing factors, particularly lithology and cover thickness, which are recognized to be the two most significant intrinsic natural causes of subsidence in the study area.
The similarity of the two GSS maps confirms that subsidence occurrence has not changed much across one decade and hence it could likely be induced by natural causes rather than by anthropic causes. Additionally, a larger and recent amount of data on groundwater withdrawal over the study area could be useful in future research for better investigating this issue in subsidence occurrence. Moreover, the GSS maps were produced on wide scale as a qualitative overview, so further studies would need to be carried out at site-specific level if more detailed data and in-depth studies were required.
The subsidence susceptibility maps produced by the Frequency Ratio method and by the Fuzzy Logic rules (GSS FR and GSS FL ) were compared in mapping outputs and in validation results.
The comparative spatial distributions of susceptibility classes highlighted that the GSS FL map shows wider extreme susceptibility classes ('very high' and 'very low') with respect to GSS FR . This facet is due to the work of fuzzy logic operator, which tends to simulate a more "binary" susceptibility map with high and null susceptibility classes.
Concerning the validation performed through ROC analysis, the areas under the prediction-rate curve were used to assess and compare the prediction capabilities of the models, whereas the areas under the success-rate curve allowed to determine the goodness of the used models in classifying the existing subsidence areas in the GSS map. The AUC results showed that FR and FL models have good predictive powers and suitable reliability as all values range from 0.80 to 1.00. The efficiency of both methods were quantitatively compared. Results of reliability (success-rate, using the training dataset) of models concluded that the FR had the highest value of AUC in comparison to the FL. Also, results of predictive power (prediction-rate, using the validation dataset) depict that the FR algorithm had the highest value of AUC and thus a better prediction capacity with respect to the FL approach.
The GSS maps provided in this research are acceptable for the employed scale of analysis. Although the acquired results have showed the efficiency of both the models for ground subsidence in the study area, the Frequency Ratio approach has however proved to be more effective than the Fuzzy Logic approach for subsidence susceptibility assessment in terms of prediction accuracy. On the other hand, FR as a data-driven approach strictly depends on the areas inventory as well as on the choice of the training dataset, and thus all the problems or inaccuracies of the initial inventory map are automatically inherited to the final susceptibility map. Hence, the training process could be implemented to avoid over-fitting models and, as a possible future perspective, the statistical model could be merged with an expert-based model in a "hybrid, ensemble approach", which could provide a better performance than the individual models.

Conclusions
This study was aimed at assessing and mapping the Ground Subsidence Susceptibility (the so-called GSS) in the Grosseto plain (Tuscany Region, Italy) by starting from two inventories referred to distinct temporal intervals (2003-2009 and 2014-2019) derived from InSAR data of ENVISAT and SENTINEL-1 satellites. Two parallel approaches, i.e., a bivariate statistical analysis (Frequency Ratio, FR) and a mathematical probabilistic model (Fuzzy Logic operator, FL) were applied and compared.
Five conditioning factors as subsidence-related natural causes in the study area were identified, overlaid and managed within a pixel-by-pixel analysis as layers in a GIS environment. The susceptibility modelling performed through the two approaches led to slightly different scenarios that were compared, discussed and validated. The reliability and prediction accuracy of all the models were assessed by ROC analysis and by the cross-comparison of pixels in subsidence from the initial inventory with each susceptibility class.
All of the resulting GSS maps show that the central and eastern parts of the plain are the most susceptible to ground subsidence. These sectors correspond to flat areas on alluvial and colluvial deposits with thick sedimentary cover on the bedrock (higher than 20 m). Hence, the spatial pattern of ground subsidence susceptibility, which is almost the same in the two temporal intervals, is mainly due to the specific spatial distribution of lithology and cover thickness that are recognized to be the two most relevant natural factors of subsidence in the study area.
The AUC values of ROC analysis of the FR model were a little higher than the ones of FL model, making it evident that the former was a better predictor and a more appropriate method for subsidence susceptibility analysis in the study area.
The predictions are essential for environmental planning and managing in order to control and reduce impacts of ground subsidence phenomena. GSS mapping has been proven as a valuable strategy to predict and preliminarily identify high-risk areas of ground lowering due to natural causes. In the Grosseto plain, which experienced subsidence and sinkholes, and which is still suffering slow-moving lowering of the ground, the results of this work could be a useful product for environmental planning managers. Hence, GSS maps provided a qualitative overview of the subsidence scenarios that may be helpful for environmental local authorities in charge of land use planning in the study area.
The presented methodologies could be also reproduced and tested in other study areas in order to check the modeling performance in different environmental settings. Funding: This research was funded by the Regional government of Tuscany, under the agreement of "Monitoring ground deformation in the Tuscany Region with satellite radar data" between the Earth Sciences Department of the University of Firenze (Italy), the Tuscany Region and the Italian Department of Civil Protection.
Acknowledgments: ENVISAT PSI data were obtained within the National Cartographic Portal of the project Pst-A "Piano Straordinario di Telerilevamento" [57]. Sentinel-1 data were processed by TRE ALTAMIRA and obtained within the portal from the project "Monitoring ground deformation in Tuscany Region with satellite radar data" funded by the Tuscany Region authority [36].

Conflicts of Interest:
The authors declare no conflict of interest.