Comparing Fuzzy Sets and Random Sets to Model the Uncertainty of Fuzzy Shorelines

This paper addresses uncertainty modelling of shorelines by comparing fuzzy sets and random sets. Both methods quantify the extensional uncertainty of shorelines extracted from remote sensing images. Two datasets were tested: pan-sharpened Pleiades with four bands (Pleiades) and pan-sharpened Pleiades stacked with elevation data as the fifth band (Pleiades + DTM). Both fuzzy sets and random sets model the spatial extent of the shoreline, including its uncertainty. Fuzzy sets represent shorelines as a margin determined by upper and lower thresholds and their uncertainty as confusion indices. They do not consider randomness. Random sets fit a mixed Gaussian model to the image histogram and represent shorelines as a transition zone between water and non-water. Their extensional uncertainty is assessed by the covering function. The results show that fuzzy sets and random sets produced closely similar shorelines. Kappa (κ) values differed only slightly, and McNemar's test showed high p-values, indicating a similar accuracy. Inclusion of the DTM (digital terrain model) improved the classification results, especially for roofs, inundated houses and inundated land. The shoreline model using Pleiades + DTM performed better than the model using Pleiades only, for both fuzzy sets and random sets. It achieved κ values above 80%.


Introduction
Remote sensing offers a practical and economical means for coastal research. A series of remote sensing images can be used, for example, for mapping the dynamics of wet grassland and vegetation patches [1], mapping depth and water quality [2], coastal erosion [3], and in particular shoreline mapping [4–6]. Instantaneous shoreline locations extracted from remote sensing images have become popular since mapping shorelines using ground survey and photogrammetry is costly. Several methods have been proposed, for example, manual digitization [7], extraction of spectral indices such as water and vegetation indices [8], active contour segmentation [6], band ratios [9], and image classification [10,11]. Most of these methods are based on hard classifications, and only a few considered soft classifications in the context of shoreline mapping [4,5,12]. A hard classifier allocates a pixel to one class only, based on the highest similarity. Therefore, applying hard classification for shoreline mapping could be misleading, since a shoreline is defined as the interface between land and water surfaces with its position changing over time. As images only capture a shoreline at a particular instant, they convey various kinds of uncertainties. Riesch [13] mentioned that uncertainties may be inherent in the system or can arise from incomplete knowledge. This first type of uncertainty is classified as errors [14] or as indeterminate boundaries [15,16]. When a shoreline is clearly identified, the errors or the kind of indeterminate boundaries may arise, for example, during data processing and measurements.
Meanwhile, the second type of uncertainty is divided into vagueness and ambiguity [14]. A vague boundary inherently belongs to the nature of shorelines, i.e., it is hardly possible to define the extent of shoreline objects such as coastal land, water and their gradual transition. Ambiguity may arise owing to differences in the classification system and in the perception of shorelines.
A common approach to model the uncertainty of objects is based upon probability theory [17]. For example, the epsilon band [18] is applied to model positional uncertainties of geographical objects. In addition, random sets theory is applied to handle the uncertainty in spatial information, for example for the definition of geographical areas, in mathematical morphology and in geostatistics [19]. Fuzzy sets theory, introduced by Zadeh, provides a conceptual framework for solving representation and classification problems in an ambiguous environment by means of membership functions [20].
In this study, we focused on the similarity of fuzzy sets and random sets in modelling the uncertainty in shoreline locations. Fuzzy sets are sets or classes that allow partial memberships [21,22]. A fuzzy set is characterized by a membership function which assigns to each object a grade of membership in the range [0,1], with 0 representing "non-membership" and 1 representing "full membership" of the set. Two ways are commonly distinguished to develop this membership function: the semantic import model (SIM), derived from expert knowledge, and the fuzzy c-means classifier (FCM). SIM is subjective in nature [23] since it is based on subjective perceptions of vague categories rather than on data in the given problem, i.e., it extends crisp boundaries into a transition zone [24]. In contrast, FCM is obtained from a set of attribute data and results in an objective approach. It is a commonly used method to estimate membership values. FCM was developed by Dunn [25] and generalized by Bezdek [26]. Fuzzy sets theory has been widely used in remote sensing, e.g., for image classification [5,27–29]. Fuzzy sets have also been applied in GIS, e.g., for developing spatial data models for vague objects and their topological relations [23,30–32].
A random set is a generalization of a random variable taking subsets as values. Random sets theory is an inherent part of probability theory [33–35]. We can estimate the probabilities of whether or not a random set is included in a given set, i.e., the core, support and α-level sets [19]. Random sets theory has been employed to develop image segmentation methods [36], to characterize varying geometrical shapes [37] and to quantify the extensional uncertainty of spatial objects such as road polygons [38] and wetlands [1].
The connection between fuzzy sets and random sets has been discussed in the past [39–42]. Random sets theory is a methodology to deal with the uncertainty of outcomes of random phenomena. Fuzzy sets theory describes the uncertainty associated with classification, or the placement of an outcome in a given class, due to imprecision [42]. Goodman et al. [17] stated that fuzzy sets are equivalent to a weak specification of random sets. Moreover, Zadeh argued that probability theory must be used together with fuzzy logic to enhance its effectiveness and that both theories are complementary rather than competitive [39]. Fuzzy sets and random sets can be related via the one-point covering function of random sets, defined as the probability that an element is covered by a random set. The membership function of fuzzy sets is then considered as the probability of a random set covering a point [17,43,44].
The objective of this research is to compare the performance of fuzzy sets and random sets in shoreline mapping. Water and non-water pixels were used as proxies to determine the shoreline features extracted from remote sensing images. The comparison between both methods is implemented using two types of images: the original Pleiades image and the combination of Pleiades with airborne LiDAR altimetry data.

Study Area
The study was conducted in an area situated along the north coast of Central Java, Indonesia (Figure 1). The study site has an average slope of less than 5.0% and an elevation of less than 5.0 m above mean sea level (MSL). From 30-year tidal data, the average tidal range of this location was 1.0 m (−0.5 to +0.5 m) [45]. The area has a mixed semi-diurnal tide with two high tides and two low tides of various heights. Tidal floods occur regularly in line with tidal cycles. Land subsidence at a rate of approximately 6.0 cm year$^{-1}$ [46] aggravates the severity of the flooding. Furthermore, the threat is increased by the rise of sea level in Indonesia, which averages 0.6 cm year$^{-1}$ [47].
The site is a typical tidal area with a high density of rivers. In the past, it included extensive fishponds and rice fields [48] and the rivers were used for irrigation purposes. Rural settlements are found along the riverbanks or adjacent to the shorelines. Four villages, Bedono, Sriwulan, Sidogemah and Purwosari, are located in the study area, with population sizes of approximately 3500, 12,500, 7000 and 6300 inhabitants, respectively [49].
Since the 1990s, the productive fishponds and rice fields have been submerged and abandoned as swamp areas [50,51], leading to shoreline changes [4]. The area is prone to frequent inundations at high tide [52,53]. Several measures have been taken to minimize their impact, for example the creation of dykes along the drainage system near the settlements, dredging of the drainage channels, elevating roads and house floors, and building a permeable dam as a sediment trap [54].

Dataset
We used a high-resolution Pleiades image and elevation data. The water level observed at a nearby tide station at the time of image acquisition was also used. These three data sources were provided by the Indonesia Geospatial Information Agency (BIG).
The image was acquired on 27 February 2013 at the lowest tide (the water level was 0.17 m below MSL). Table 1 shows the characteristics of the Pleiades image used. The image is a pan-sharpened ortho product obtained at the standard processing level, at which pan-sharpening, radiometric and geometric corrections were applied by the image provider.
The DTM was created from LiDAR data recorded in August 2014. The data are in UTM projection and elevations are in meters referenced to the Earth Gravitational Model 2008 (EGM 2008). The mission report [55] states that the DTM data have a pixel size of 1.0 m, a vertical accuracy of ±0.17 m (linear error at 90% confidence, LE90), and a horizontal accuracy of ±0.22 m (circular error at 90% confidence, CE90).

Pre-Processing
The DTM was combined with the Pleiades imagery to improve the quality of the classification. The DTM data and the Pleiades image had to be pre-processed before they could be combined. First, the geoid-based DTM data were adjusted to coincide with local MSL. In the study area, the geoid and MSL differ by as much as 1.34 m [56]. Second, the DTM data were linearly stretched to map their original elevation range (−1.34 to 4.0 m) to the 16-bit range of the Pleiades image. Third, the histogram minimum method [57] was applied to the image and an average filter with a 3 × 3 window was applied to reduce the image variance. Fourth, the Pleiades image was co-registered and resampled to match the DTM data.
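The stretching and smoothing steps can be sketched as follows. This is a minimal illustration, not the processing chain used in the study: the function names, the clipping of out-of-range elevations, and the border handling of the 3 × 3 filter are assumptions.

```python
import numpy as np

def stretch_to_uint16(dtm, lo=-1.34, hi=4.0):
    """Linearly map elevations in [lo, hi] metres to the 16-bit range [0, 65535].

    Values outside [lo, hi] are clipped first (an assumption of this sketch).
    """
    scaled = (np.clip(dtm, lo, hi) - lo) / (hi - lo)
    return np.round(scaled * 65535).astype(np.uint16)

def mean_filter_3x3(band):
    """3 x 3 moving average; border pixels are replicated (an assumption)."""
    padded = np.pad(band.astype(np.float64), 1, mode="edge")
    rows, cols = band.shape
    total = np.zeros((rows, cols), dtype=np.float64)
    # Sum the nine shifted copies of the padded band, then divide by 9.
    for dr in range(3):
        for dc in range(3):
            total += padded[dr:dr + rows, dc:dc + cols]
    return total / 9.0

# Hypothetical 2 x 2 elevation patch in metres.
dtm = np.array([[-1.34, 0.0], [2.0, 4.0]])
stretched = stretch_to_uint16(dtm)
smoothed = mean_filter_3x3(stretched)
```

The stretched DTM can then be stacked with the four image bands as a fifth band once both rasters share the same grid.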
For comparison of the methods, we prepared two types of datasets: (a) pan-sharpened Pleiades with four bands (referred to as Pleiades); and (b) pan-sharpened Pleiades with four bands stacked with the DTM as the fifth band (referred to as Pleiades + DTM). We created 13 subsets and grouped them into four groups (Figure 1). Subsets are named $S_{a-b}$, where $a$ is the group number ($a = 1, \ldots, 4$) and $b$ is the subset number ($b = 1, \ldots, 13$). Each subset consists of 423 × 282 pixels, except $S_{2-3}$, which consists of 374 × 381 pixels, and $S_{2-13}$, which has 317 × 478 pixels. We grouped the subsets based on land cover similarities:
a. $S_1$ is a mix of settlements and vegetation. This group consists of six subsets. They have a similar land orientation, stretching from northwest to southeast, indicating rural settlements on a strip of land surrounded by inundated fishponds. Rivers of various widths divide each island into two sides and small roads are found on either side of the river. Rural settlements are mostly concentrated alongside the roads, with sparse vegetation coverage.
b. $S_2$ is a mix of settlement and vegetation with a more complex composition. Small rivers are clearly seen in $S_{2-3}$ and $S_{2-12}$. Fishponds with irregular shapes are visible in the northern part of $S_{2-12}$ and $S_{2-13}$.
c. $S_3$ is dominated by vegetation coverage. Rural settlements are visible in $S_{3-5}$ along the river side, and a wide muddy area can be found in the northern part of the subset close to the mangrove area.
d. $S_4$ shows rural settlements surrounded by inundated fishponds. The settlements are protected by a concrete embankment.

FCM Classification
Unsupervised FCM was applied as the clustering algorithm [26] to estimate membership values. It separates the datasets into two classes, allowing each pixel to have a membership value in multiple classes. The membership values ($\mu$) range from 0 to 1 and add up to 1 for each pixel. In this work, the membership values of the classification follow the trapezoidal membership function [5].
FCM classification has a parameter $c$ specifying the number of classes and a parameter $m$ specifying fuzziness. Bezdek et al. [26] stated that values of $m$ between 1.5 and 3.0 produced good results, while Deer and Eklund [58] used $m = 1.6$. In our previous study using Landsat images, we found that $m = 1.7$ produced an accurate fuzzy classification. In this study, we investigated values of $m$ from 1.1 to 3.0 in steps of 0.1 to identify the optimal value. We also investigated $c$ values from 2 to 4 in order to find the optimal number of classes. In addition, we determined the cluster validity index of Xie and Beni [59] as:

$$XB = \frac{\sum_{i=1}^{c} \sum_{x=1}^{N} \mu_{ix}^{2} \, \lVert V_i - X_x \rVert^{2}}{N \, \min_{i \neq j} \lVert V_i - V_j \rVert^{2}} \qquad (1)$$

where $XB$ refers to the compactness and separation validity function of a fuzzy partition of the dataset $X$ with pixels $x = 1, 2, \ldots, N$, $V_i$ ($i = 1, 2, \ldots, c$) is the centroid of class $i$, $N$ is the number of pixels (or data points), and $\mu_{ix}$ is the membership of pixel $x$ in class $i$.
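As an illustration of how such a validity index can be evaluated for a candidate number of classes, the sketch below computes the Xie–Beni index in its original form, with squared memberships in the numerator. The function name and array layout are assumptions, and the exponent on the memberships may differ in generalized variants of the index.

```python
import numpy as np

def xie_beni(X, V, U):
    """Xie-Beni index (original form, memberships squared).

    X: (N, bands) pixel values, V: (c, bands) centroids, U: (c, N) memberships.
    Lower values indicate more compact and better separated clusters.
    """
    # Squared distance of every pixel to every centroid, shape (c, N).
    d2 = ((X[None, :, :] - V[:, None, :]) ** 2).sum(axis=2)
    compactness = (U ** 2 * d2).sum()
    # Squared distance between every pair of centroids, ignoring i == j.
    sep = ((V[:, None, :] - V[None, :, :]) ** 2).sum(axis=2)
    sep[np.diag_indices_from(sep)] = np.inf
    return compactness / (X.shape[0] * sep.min())

# Hypothetical one-band example with two crisp clusters.
X = np.array([[0.0], [1.0], [10.0], [11.0]])
V = np.array([[0.5], [10.5]])
U = np.array([[1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 1.0]])
xb = xie_beni(X, V, U)   # compactness 1.0, N = 4, separation 100
```

Comparing the index across $c = 2, 3, 4$ at a fixed $m$ gives the kind of comparison reported in the cluster validity table.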
After clustering, membership images were compiled for each class. We labelled one of the two membership images as belonging to the water class by using the near infrared (NIR) band of Pleiades. The water label was given to the class with the minimum class mean in the near infrared band [5].
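The clustering and water-labelling steps can be sketched with a plain NumPy implementation of the standard FCM update rules. This is a minimal sketch, not the software used in the study: the random initialization, the stopping rule, the use of the cluster centroid as the class mean, and the position of the NIR band (index 3) are all assumptions.

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=1.6, n_iter=100, tol=1e-5, seed=0):
    """Standard FCM. X: (N, bands). Returns centroids (c, bands), memberships (c, N)."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0, keepdims=True)            # memberships sum to 1 per pixel
    for _ in range(n_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)      # weighted centroids
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)                    # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return V, U

# Hypothetical 4-band pixels: dark NIR for water, bright NIR for land.
rng = np.random.default_rng(1)
water = rng.normal([0.20, 0.20, 0.20, 0.05], 0.02, size=(50, 4))
land = rng.normal([0.40, 0.40, 0.30, 0.60], 0.02, size=(50, 4))
pixels = np.vstack([water, land])

V, U = fuzzy_c_means(pixels, c=2, m=1.6)
NIR = 3                                          # assumed band order
water_class = int(np.argmin(V[:, NIR]))          # class with the lowest NIR centroid
mu_water = U[water_class]                        # water membership per pixel
```

For a real subset, the image would first be reshaped from (rows, cols, bands) to (N, bands), and `mu_water` reshaped back to (rows, cols) to obtain the water membership image.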

Image Segmentation
The possible shoreline location was determined by generating a margin or transition zone between the classes water and non-water. We applied an approach similar to [5], defining a threshold range obtained from parameter estimation in the subsets. We applied thresholding to create crisp boundaries of the transition zone, determined by a lower threshold ($t_1$) and an upper threshold ($t_2$). The class water $C_w$ was defined as:

$$C_w = \{\, x \mid \mu_{wx} \geq t \,\}$$

where $\mu_{wx}$ is the membership of pixel $x$ in water and $t$ is the threshold value. We investigated values of $t$ from 0.1 to 0.9 in steps of 0.1 to estimate the optimal threshold value.
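A minimal sketch of the resulting three-zone map is given below. The class codes and the treatment of pixels exactly at $t_1$ or $t_2$ (assigned to the margin here) are assumptions; the default thresholds 0.3 and 0.7 are the range selected later in the paper.

```python
import numpy as np

NON_WATER, SHORELINE, WATER = 0, 1, 2   # hypothetical class codes

def shoreline_margin(mu_water, t1=0.3, t2=0.7):
    """Split a water membership image into non-water, shoreline margin and water."""
    zones = np.full(mu_water.shape, SHORELINE, dtype=np.uint8)
    zones[mu_water < t1] = NON_WATER    # clearly land
    zones[mu_water > t2] = WATER        # clearly water
    return zones

mu = np.array([0.1, 0.3, 0.5, 0.7, 0.9])
zones = shoreline_margin(mu)
```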

Uncertainty Estimation
The uncertainty of deriving the fuzzy shoreline was estimated by a confusion index $CI$ for each pixel as follows [15]:

$$CI = 1 - (\mu_{ix1} - \mu_{ix2})$$

where $\mu_{ix1}$ refers to the highest membership and $\mu_{ix2}$ denotes the second highest membership of pixel $x$.
The $CI$ values range from 0 to 1. A value approaching 1 means that the difference between the highest and the second highest membership is small, so the uncertainty of assigning the pixel to the class with the largest membership is high.
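The confusion index can be computed directly from the membership images. The sketch below assumes memberships stored as a (classes, pixels) array; the function name is illustrative.

```python
import numpy as np

def confusion_index(U):
    """CI per pixel: 1 - (highest membership - second highest membership).

    U: (c, N) array of memberships with c >= 2.
    """
    top2 = np.sort(U, axis=0)[-2:]      # row 0: second highest, row 1: highest
    return 1.0 - (top2[1] - top2[0])

# Three hypothetical pixels: fully mixed, mostly one class, pure.
U = np.array([[0.5, 0.9, 1.0],
              [0.5, 0.1, 0.0]])
ci = confusion_index(U)
```

With two classes whose memberships sum to 1, the index reduces to $1 - |2\mu_{wx} - 1|$, so it peaks where the water membership is 0.5.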

Parameter Estimation of Random Sets
Let the intensity of an image $I$ within a window $W$ be denoted as $f(x)$, $x \in W$. The intensity function $f$ can be interpreted as a collection of sets $F = \{x \in W, R \in [0, 1] : f(x) \geq R\}$. $F$ becomes a random set when $R$ is a random variable. The distribution of a random set is determined by $f$ and the random variable $R$ [1].
Thresholding was chosen to model the shoreline from the water membership image. Thresholding is a process that separates the pixels of an image into regions (or clusters) based on their intensity. Usually this segmentation process is based on the image histogram [60]. If the image is composed of regions that are clearly separated in its histogram, the histogram is usually bimodal with a deep valley. In that case, the bottom of the valley is taken as the threshold for foreground and background separation. However, the choice of threshold is not an easy task owing to the existence of an uncertain area between the two peaks of the histogram. There are various methods to find the optimal threshold between the foreground and the background [1,61]. In this study, we consider the uncertain area, the transition zone between the foreground and background, as a third class, shoreline, with intensity values in the interval $[t_1, t_2]$, where $0 \leq t_1 \leq t_2 \leq 1$ (Figure 2). We consider the shoreline as the transition zone between water as the foreground and the coastal land as the background. We aim to extract the extent of the shoreline and model it as a random set to quantify its extensional uncertainty. The critical part of creating the random sets model is to generate realizations that characterize its distribution. To obtain these realizations, the probability distribution of $R$ was determined and random numbers in the transition zone $[t_1, t_2]$ were generated as multiple thresholds. We chose the Gaussian distribution [1,61], based on the assumption that pixel values close to the object boundary have a higher probability of being labelled as the boundary than pixels at a distance.
A mixed Gaussian model was used to fit a density distribution to the image histogram and to determine the transition zone $[t_1, t_2]$ (Figure 2). When using multi-temporal images for shoreline mapping, each image has a different histogram reflecting a different proportion of transition zones. An image recorded during a low tide has a broader transition zone than an image recorded at a high tide. Hence, we chose the mixed Gaussian model with three components: the distributions of water, non-water, and shoreline as the transition zone.
Let the three classes be denoted as non-water ($C_1$), shoreline ($C_2$) and water ($C_3$). We assumed that the intensity of pixels belonging to class $C_i$, $i \in \{1, 2, 3\}$, follows a Gaussian distribution $C_i \sim N(M_i, \Sigma_i)$ with mean $M_i$, standard deviation $\Sigma_i$ and density function $p_i$ in a one-dimensional model. The density function of $I$ is the mixed density distribution of $C_i$:

$$p(z) = \sum_{i=1}^{3} \Theta_i \, p_i(z)$$

where $z = f(x)$, $\Theta_i$ is the weight coefficient for $C_i$ and $\sum_{i=1}^{3} \Theta_i = 1$. The lower limit of the shoreline is determined at $t_1$ where $\Theta_1 p_1(t_1) = \Theta_2 p_2(t_1)$, and the upper limit at $t_2$ where $\Theta_2 p_2(t_2) = \Theta_3 p_3(t_2)$. In this way, we identify three classes as presented in Figure 2: non-water ($C_1$) with $z < t_1$, shoreline ($C_2$) with $t_1 \leq z \leq t_2$, and water ($C_3$) with $z > t_2$. The transition interval $[t_1, t_2]$ is determined by tuning the weight of the shoreline component. For example, suppose that the thresholds 0.4, 0.5, and 0.6 were adopted for hard classification of the shoreline; we then investigated an interval around these values to find the optimal threshold interval for random sets.
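To illustrate how the weight of the shoreline component shifts the transition interval, the sketch below evaluates a three-component mixture on a grid and locates the points where adjacent weighted component densities cross. The crossing criterion is an assumption of this sketch, and the means, standard deviations and weights are hypothetical values, not estimates from the study.

```python
import numpy as np

def gauss(z, mean, sd):
    """One-dimensional Gaussian density."""
    return np.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * np.sqrt(2.0 * np.pi))

def crossing(z, a, b, lo, hi):
    """Grid value in [lo, hi] where the curves a and b are closest to each other."""
    mask = (z >= lo) & (z <= hi)
    return z[mask][np.argmin(np.abs(a[mask] - b[mask]))]

def transition_interval(means, sds, weights, n_grid=1001):
    z = np.linspace(0.0, 1.0, n_grid)
    comp = [w * gauss(z, mu, sd) for w, mu, sd in zip(weights, means, sds)]
    t1 = crossing(z, comp[0], comp[1], means[0], means[1])   # non-water vs shoreline
    t2 = crossing(z, comp[1], comp[2], means[1], means[2])   # shoreline vs water
    return t1, t2

# Hypothetical components: non-water, shoreline, water.
means, sds = [0.1, 0.5, 0.9], [0.08, 0.12, 0.08]
t1, t2 = transition_interval(means, sds, weights=[0.4, 0.2, 0.4])
# A heavier shoreline component widens the interval.
t1_wide, t2_wide = transition_interval(means, sds, weights=[0.3, 0.4, 0.3])
```

In practice the component parameters would come from fitting the mixture to the histogram of the water membership image, for example by expectation maximization.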

Modelling the Extent of Shoreline by Random Sets
We generated $n$ random numbers $R_1, \ldots, R_n$ from the distribution of $R$ on $[t_1, t_2]$. This results in different realizations $O_1, \ldots, O_n$ of a random set, obtained by thresholding the water membership image using $R_1, \ldots, R_n$ as the multiple thresholds:

$$O_i = \{\, x \in W : f(x) \geq R_i \,\}, \quad i = 1, \ldots, n$$

We investigated the optimal value of $n$ ($n = 10, \ldots, 300$) in steps of 10. Intuitively, a value closer to the optimal $n$ should be more reliable, and the variance of the random set $\Gamma$ becomes stable as $n$ increases. For each $n$, the covering function $\Pr_\Gamma(x)$ can then be determined, including the core, median, support and level sets. We provided a curve of the core area as a function of $n$. If the difference between the standardized core areas of two successive $n$ values (denoted as $d_i$) is small (e.g., in the range −1 to +1), we accepted this $n$ as the optimal $n$.
The idea behind the generation of random sets is that the extent of segmented shoreline objects should be sensitive to variation in the threshold parameter when the extracted objects have a large extensional uncertainty. By slightly changing the threshold values $R_1, \ldots, R_n$, we obtained a set of objects $O_1, \ldots, O_n$ and constructed a random set $\Gamma$. For example, for $n = 100$ and the threshold interval [0.3, 0.7], we generate 100 thresholds to slice the membership image and produce samples as binary maps. Each sample is a realization of a focal element $O_i$ of the random set $\Gamma$. The focal elements are regions which are subsets of $W$: $O_i \in \mathcal{P}(W)$. If the random set is constructed from $n$ focal elements with equal probability, then $u_i = 1/n$. We need to estimate the covering function $\Pr_\Gamma(x)$ to measure the probability of pixel $x$ being covered by the random set. The covering function characterizes the distribution of the random set $\Gamma$. The covering function $\Pr_\Gamma(x)$ at point $x$ equals $P(\Gamma \cap \{x\} \neq \emptyset) = P(x \in \Gamma)$. It can be described by focal elements $O_i$ with corresponding uncertainty assignments $u_i$, indicated as a collection of pairs $\{O_i, u_i\}$, $i \in (1, \ldots, n)$ [38]. The covering function of random sets can be estimated by [1,38]:

$$\Pr_\Gamma(x) = \sum_{i=1}^{n} u_i \, I_{O_i}(x) \qquad (5)$$

where $I_{O_i}(x)$ is the indicator function of $O_i$, equal to 1 if $x \in O_i$ and 0 otherwise. Figure 3a illustrates a simple example of covering function estimation for random sets with equal uncertainty assignments, reflected by equal intervals $u_i$. Figure 3b shows the covering function values at six pixels derived by Equation (5). Table 2 provides the statistical parameters of random sets [62].
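The generation of focal elements and the estimate of the covering function with equal weights $u_i = 1/n$ can be sketched as follows. The Gaussian used for the thresholds (centred on the interval, standard deviation one sixth of its width, with out-of-range draws rejected) is an assumption of this sketch, as are the function names.

```python
import numpy as np

def sample_thresholds(n, t1=0.3, t2=0.7, seed=0):
    """Draw n thresholds from a Gaussian, keeping only values inside [t1, t2]."""
    rng = np.random.default_rng(seed)
    mean, sd = 0.5 * (t1 + t2), (t2 - t1) / 6.0    # assumed parameters
    kept = np.empty(0)
    while kept.size < n:
        draw = rng.normal(mean, sd, size=2 * n)
        kept = np.concatenate([kept, draw[(draw >= t1) & (draw <= t2)]])
    return kept[:n]

def covering_function(mu_water, thresholds):
    """Fraction of focal elements O_i = {x : mu_water(x) >= R_i} covering each pixel."""
    focal = mu_water[None, :, :] >= thresholds[:, None, None]   # (n, rows, cols)
    return focal.mean(axis=0)                                   # equal weights 1/n

# Hypothetical 2 x 2 water membership image.
mu_water = np.array([[0.10, 0.50],
                     [0.65, 0.90]])
R = sample_thresholds(100)
pr = covering_function(mu_water, R)
```

Pixels with membership below $t_1$ are never covered and pixels above $t_2$ are always covered, so only pixels inside the transition interval receive intermediate values.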

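Table 2. Statistical parameters of random sets [62] (parameter, symbol, purpose).
The α-level set ($\Gamma_\alpha$): to describe the spatial distribution of the varying sizes of $\Gamma$
The core set ($\Gamma_1$): to describe the certain part of $\Gamma$
The median set ($\Gamma_m = \Gamma_{0.5}$): to describe the 0.5-level set
The support set ($\Gamma_0$): to describe the possible part of $\Gamma$
The mean set of $\Gamma$
The set-theoretic variance ($\Gamma_v$)
The sum of variance (SV)
The coefficient of variation (CV)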
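Several of these parameters follow directly from the covering function. The sketch below assumes equal weights $u_i = 1/n$, in which case the per-pixel variance of the indicator values equals $\Pr_\Gamma(x)\,(1 - \Pr_\Gamma(x))$; that form of the set-theoretic variance, and taking SV as its sum over pixels, are assumptions of this sketch. The mean set and CV are omitted because their definitions are not given in the text.

```python
import numpy as np

def level_set(pr, alpha):
    """Alpha-level set: pixels covered with probability at least alpha (alpha > 0)."""
    return pr >= alpha

def random_set_summary(pr):
    """Core, median and support sets plus variance measures from a covering function."""
    core = pr >= 1.0                     # certainly inside the set
    median = level_set(pr, 0.5)
    support = pr > 0.0                   # possibly inside the set
    variance = pr * (1.0 - pr)           # assumed form of the set-theoretic variance
    return {
        "core_area": int(core.sum()),
        "median_area": int(median.sum()),
        "support_area": int(support.sum()),
        "variance": variance,
        "SV": float(variance.sum()),     # sum of variance over all pixels
    }

# Hypothetical covering function on a 2 x 2 window.
pr = np.array([[0.00, 0.25],
               [0.50, 1.00]])
summary = random_set_summary(pr)
```

The variance is zero inside the core and outside the support, so SV grows with the number of pixels in the transition zone.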

Validation and Comparing Classification Performance
To quantify the accuracy of each model, we used the error matrix to estimate the κ (kappa) values. For this purpose, we produced a hardened FCM using $t = 0.5$ and the median set $\Gamma_{0.5}$. We compared the performance of both approaches using two input images (Pleiades and Pleiades + DTM).
Reference data were derived from the 0.5 m Pleiades image acquired in 2013. Using stratified random sampling, approximately 138 points were randomly selected for each subset. Visual interpretation was performed to assign a land cover class to each selected point.
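For reference, κ can be computed from an error matrix as the observed agreement corrected for chance agreement. The sketch below uses the standard Cohen's kappa formula; the counts in the example matrix are hypothetical.

```python
import numpy as np

def cohens_kappa(error_matrix):
    """Cohen's kappa from a square error matrix (rows: map, columns: reference)."""
    m = np.asarray(error_matrix, dtype=np.float64)
    n = m.sum()
    observed = np.trace(m) / n                              # overall agreement
    expected = (m.sum(axis=0) * m.sum(axis=1)).sum() / n ** 2   # chance agreement
    return (observed - expected) / (1.0 - expected)

# Hypothetical water / non-water error matrix for one subset.
kappa = cohens_kappa([[45, 5],
                      [10, 40]])
```

Here the observed agreement is 0.85 and the chance agreement 0.50, giving κ = 0.70.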
To test the significance of the difference between (a) fuzzy sets and random sets and (b) Pleiades and Pleiades + DTM, we performed McNemar's test [63–65]. McNemar's test is based on confusion matrices that are 2 by 2 in dimension. The null hypothesis stated that both methods, or both input images, produced similar accuracy. The test is based on the chi-square statistic at the 95% level of confidence, computed as follows:

$$\chi^2 = \frac{(f_{12} - f_{21})^2}{f_{12} + f_{21}}$$

where $f_{12}$ denotes the number of cases that are incorrectly classified by the first method or the first image but correctly classified by the second method or the second image, and $f_{21}$ denotes the number of cases that are correctly classified by the first method or the first image but incorrectly classified by the second method or the second image.
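A minimal sketch of the test is given below, using the statistic without continuity correction and the chi-square distribution with one degree of freedom, whose upper tail can be written with the complementary error function. The function name and the example counts are hypothetical.

```python
import math

def mcnemar(f12, f21):
    """McNemar's chi-square statistic (no continuity correction) and its p-value."""
    if f12 + f21 == 0:
        return 0.0, 1.0                       # no discordant cases: no evidence of a difference
    chi2 = (f12 - f21) ** 2 / (f12 + f21)
    # Upper tail of chi-square with 1 degree of freedom: P(|Z| > sqrt(chi2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical discordant counts for two classifications of the same points.
chi2, p_value = mcnemar(10, 2)
```

A p-value below 0.05 rejects the null hypothesis of equal accuracy at the 95% level of confidence; only the discordant cases $f_{12}$ and $f_{21}$ enter the statistic.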

Parameter Estimation of FCM Classification
Figure S1 shows the κ values obtained when we estimated $c$ and $m$ for all subsets of the Pleiades + DTM image. For low $m$ (e.g., $m$ = 1.1-1.6), classifications show a comparable κ for all thresholds, and the highest κ was obtained for $c = 2$, while for high $m$ ($m$ = 2.0-3.0), high κ values were obtained only for certain $t$ values. For example, when we set $m = 1.5$ and $c = 2$, high κ values were obtained for $t$ = 0.2-0.8, while for $m = 3$ and $c = 3$, high κ values were only obtained for $t = 0.2$. Thus, for a high $m$ value, the result is more sensitive to the choice of $t$. In addition, Table S1 shows the cluster validity measures as an alternative approach to determine the number of classes for FCM classification. From the results, $c = 2$ obtained the minimum values for all $m$, which indicates a partition in which all clusters are overall compact and separate from each other. Based on both approaches to estimating the suitable number of classes for FCM classification, we decided that $c = 2$ was the optimal number of classes for further image processing steps.
Figure S2 shows the results of threshold range estimation when we set a constant $c$ ($c = 2$) for various $m$. We can see that $t = 0.5$ gave the highest κ value for all subsets, while the threshold range 0.3-0.7 provided high κ values. High values of $m$ resulted in a low κ value, especially at a low $t$ ($t < 0.3$) and a high $t$ ($t > 0.6$). Given that the threshold range 0.3-0.7 produced high κ values, we selected those values as the threshold range within which the boundary between water and land can probably be distinguished.
In Figure S2, we can also see that $m = 1.5$ and $m = 1.6$ are comparable, as indicated by the stability of the κ value, whereas for $m > 1.6$ the result is more sensitive to the choice of $t$. Given these results, we chose $m = 1.6$ as the optimal $m$ value for FCM.

Hardened FCM and Accuracy Comparison
Figures 4 and 5 show the comparison of thresholding results for hardened FCM at $t = 0.5$ for both input images. The inclusion of DTM data has improved the classification results. Figure 4 presents an example in which roofs (non-water) were correctly classified by Pleiades + DTM but were classified incorrectly by Pleiades. The improvement can also be seen in Figure 5, which provides an example in which inundated land was clearly identified by Pleiades + DTM.
Table 3 shows the comparison of the accuracy between Pleiades and Pleiades + DTM. For all subsets, Pleiades + DTM outperformed Pleiades. Table S2 presents the significance of the difference in accuracy between both images. Seven of the tests show a significant improvement of Pleiades + DTM over Pleiades, as shown by very low p-values, whereas a few results have similar accuracies, as shown by p-values ≥ 0.05.
Figure 6 shows an illustration of the shoreline margin with fuzziness generated by setting $t = 0.3$ as the lower $t$ and $t = 0.7$ as the upper $t$, using Pleiades + DTM (for other results, see Figure S3). In this figure, the shoreline (in light green) represents the transition zone between water (in blue) and non-water (in black). The combination of the shoreline image and the confusion index is provided in Figure 6d. In this figure, a wider shoreline indicates a wider gradual transition between water and non-water, representing a more gently sloping beach or muddy coastal area, while a narrow shoreline indicates a more steeply sloping beach.

Parameter Estimation Results
Table S3 shows the results of the estimation of the parameter $n$ and the threshold interval of random sets, with the related κ values estimated from $\Gamma_{0.5}$. In Table S3 and Figure 7, it can be seen that the threshold interval 0.3-0.7 generally produced the highest κ value.
We plotted the curves of the $\Gamma_1$ area for four subsets using the selected threshold interval 0.3-0.7 (see Figure 8 and Figure S4). From these curves, we can assess the optimal $n$ at which a stable $\Gamma_1$ area is obtained. Each subset needs a different $n$ to reach a stable $\Gamma_1$ area, which might be influenced by the land cover characteristics of the study area. In Figure 8, the curve of $S_{4-4}$ required the highest $n$ to achieve stability of the $\Gamma_1$ area, whereas $S_{2-12}$ required the lowest $n$.

Uncertainty Modelling of Shoreline Objects
Figure 9 shows some examples of binary images and the related covering functions that resulted from slicing water membership images using the optimal $n$ (the other results can be seen in Figure S5). By slightly changing the threshold for $\mu_{wx}$, we obtained binary maps as realizations of focal elements $O_i$ with various extents. From these focal elements, we constructed random sets by estimating the covering function, as can be seen in Figure 9f. We plotted the area of the focal elements to explore information on the extent of the random sets (see Figure 10). From the plot in Figure 10, we can see that $S_{2-12}$, $S_{2-13}$ and $S_{3-5}$ have the largest variance, whereas $S_{4-4}$, $S_{1-10}$ and $S_{1-9}$ have the smallest variance. In Table 4, we can see that subset $S_{4-4}$ has the lowest CV value. A lower CV indicates that the random sets have a small $\Gamma_v$, reflecting a lower uncertainty. By checking the Pleiades image in Figure 1, it is obvious that $S_{4-4}$ comprises a rural settlement with concrete roads. The settlement is protected by an embankment from the surrounding open water. For an object with little uncertainty, the membership values are homogeneous. Therefore, the resulting samples $O_1, \ldots, O_n$ have similar extents (see Figure 10, subset $S_{4-4}$). In contrast, $S_{2-12}$ has the highest CV value, which indicates the highest uncertainty. For an object with a high uncertainty, the membership values are heterogeneous. Hence, the resulting samples $O_1, \ldots, O_n$ have various extents and are very sensitive to small variations in the $t$ value (see Figure 10, subset $S_{2-12}$).
Table 4. Quantification of the extensional uncertainty of all subsets (SV is the sum of variance, and CV denotes the coefficient of variation; columns: Subset, SV, CV). See notations in Table 3 for the names of subsets.
In Figure 11a, the set-theoretic variance $\Gamma_v$ is presented in grey scale values, with light colours denoting high variation in uncertain transition zones and dark colours denoting low variation in water and non-water (the other results can be seen in Figure S6). For pixels inside $\Gamma_1$ or outside $\Gamma_0$, $\Gamma_v$ equals 0, whereas pixels close to the contours of $\Gamma_1$ or $\Gamma_0$ have $\Gamma_v$ values in the range [0,1]. Figure 11b shows the contours of $\Gamma_1$, $\Gamma_m$ and $\Gamma_0$ of the random sets. The yellow rectangle sites (1) in Figure 11a,b have a different extent, implying that these sites have a wider, more gradual transition (see Figure 11c), mainly caused by their location close to the mangrove forest in a muddy area. For the yellow sites (2), however, the contours of $\Gamma_1$, $\Gamma_m$ and $\Gamma_0$ are similar and the segmentation boundaries show small variation (see Figure 11d). More pixels with a non-zero $\Gamma_v$ in the objects indicate a larger uncertainty area. SV values are the largest for $S_{2-12}$, $S_{2-13}$ and $S_{3-5}$ (see Table 4) because the number of pixels with non-zero $\Gamma_v$ values in those subsets is the largest (see Figure S6). Subsets $S_{1-9}$, $S_{4-4}$ and $S_{4-8}$ have the smallest CV, which can be observed well in Figure S6, indicating a small number of pixels with a non-zero $\Gamma_v$.
The extent of the shoreline is represented by a random sets model in Figure 12, as an example of the representations of the core set $\Gamma_1$, the support set $\Gamma_0$, and the covering function $\Pr_\Gamma(x)$ of random sets (see Figure S7 for other results). Figure 12a shows $\Gamma_1$ (blue pixels), representing the area that certainly belongs to water. Figure 12b displays $\Gamma_0$ (in blue), indicating the possible part of the area that belongs to water, whereas the area outside $\Gamma_0$ belongs to non-water (see black pixels in Figure 12d).

Accuracy Assessment of Random Sets Results
Table 5 shows the comparison of accuracy between Pleiades and Pleiades + DTM using random sets. As with fuzzy sets, Pleiades + DTM outperformed Pleiades when using random sets.

Comparing Classification Performance
Table S5 presents the McNemar's test results for fuzzy sets and random sets using the Pleiades image. The table indicates that the methods agree on the $f_{22}$ and $f_{11}$ cases but disagree on the $f_{12}$ and $f_{21}$ cases. From the test results, we can see that the p-values are relatively high (≥0.05), implying that both methods obtained a similar accuracy when using Pleiades.
Table S6 presents the McNemar's test results for fuzzy sets and random sets using the Pleiades + DTM image as input data. From the test results, we can see that the p-values are relatively high (≥0.05), implying that both methods obtained a similar accuracy when using Pleiades + DTM.

Discussion
This research compared two methods for handling the uncertainty of shorelines: fuzzy sets and random sets.Shoreline is a spatial object with inherent uncertainty that cannot be extracted effectively from satellite images by means of a crisp-based classification, since these methods ignore uncertain areas or gradual transition zones.This paper demonstrates that fuzzy sets and random sets produced comparable results for modelling the uncertainty of fuzzy shorelines.When using fuzzy sets, the same results can be achieved without taking randomness into account, as confirmed by Zadeh [39].
The κ accuracies from both fuzzy sets and random sets are slightly different (see Tables 3 and 5).In addition, the McNemar's test failed to reject the null hypothesis of equal performance of both methods by using either Pleiades or Pleiades + DTM (see Tables S5 and S6).Although fuzzy sets and random sets were not identical, shorelines resulted from both methods were close to each other (see Figures 6d and 12e) and neither could be considered more accurate, as confirmed by the literature [43,44].This is probably related to the fact that each segmentation of random sets can be interpreted as a different interpretation of a fuzzy concept, since the multiple thresholds to generate the segments are selected among the possible interpretations [43].Furthermore, Goodman argued that any given fuzzy sets is equal to one nested random set [44].
Both methods were successful in identifying the spatial extent of shorelines including their extensional uncertainties. Fuzzy sets present a shoreline as a margin derived from a crisp boundary determined by t values. Here, the extensional uncertainty of the shoreline, represented by confusion index values, implies that the shoreline can be detected with limited certainty. Through the confusion index, the presence of a gradual transition was identified where the values of adjacent grid cells are very similar. When using random sets, a shoreline is presented as a third class, the transition zone between water and non-water. The extensional uncertainty of a shoreline was assessed by using the covering function of random sets and its statistical parameters (Γ 1 , Γ m , Γ 0 and Γ v ). By using these parameters, we demonstrated that the randomness of segmentation parameters, i.e., multiple thresholds, has a different effect on extracted features when objects have different extensional uncertainties (see Figure 9). Moreover, other indicators such as SV and CV summarize the size of extensional uncertainties. A high SV and CV indicate a high extensional uncertainty.
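The role of the confusion index can be illustrated with a short sketch. It assumes the commonly used definition CI = 1 − (µ max − µ second), where µ max and µ second are the largest and second-largest class memberships of a pixel. The exact formula used in this study is not restated in this section, so the definition and the function name should be read as assumptions.

```python
def confusion_index(memberships):
    """Confusion index of one pixel, assuming CI = 1 - (largest - second largest membership)."""
    if len(memberships) < 2:
        raise ValueError("at least two class memberships are required")
    largest, second = sorted(memberships, reverse=True)[:2]
    return 1.0 - (largest - second)
```

Under this definition, a pixel with memberships (0.5, 0.5) for water and non-water has a confusion index of 1 and lies in the transition zone, whereas a pixel with memberships (1.0, 0.0) has a confusion index of 0 and is classified with full certainty.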
Fuzzy sets were applied by first estimating the membership function. In this study, we computed the membership values by performing FCM classification. On the one hand, this method is less subjective than the semantic import model [24,66], while, on the other hand, the choice of values for c and m influences the results of the classification. In contrast, random sets as a probabilistic approach avoid user interference [34,67] in generating random sets.
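To show why the fuzzifier m matters, the sketch below evaluates the standard fuzzy c-means membership update for a single pixel, given its distances to the c cluster centres. This is the textbook FCM formula rather than the authors' code. The function name and the example distances are hypothetical.

```python
def fcm_membership(distances, m):
    """Standard FCM memberships of one pixel from its distances to c cluster centres.

    u_i = 1 / sum_j (d_i / d_j) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    if m <= 1:
        raise ValueError("the fuzzifier m must be greater than 1")
    # A pixel that coincides with one or more centres belongs fully to them.
    zero = [i for i, d in enumerate(distances) if d == 0]
    if zero:
        return [1.0 / len(zero) if i in zero else 0.0 for i in range(len(distances))]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in distances) for d_i in distances]
```

For hypothetical distances (1.0, 3.0) to two centres, m = 1.6 yields a membership of about 0.97 for the nearer cluster, while m = 3 yields 0.75. A larger m therefore produces softer memberships and a wider zone of confusion.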
The random sets model was combined with thresholding to model shorelines from water membership images. Here, the choice of n, the number of focal elements, was critical. Improper threshold values R i, ..., R n from the worst n values in segmentation will result in errant segments. At low n, the Γ 1 area changed abruptly, and as n increased, the Γ 1 area stabilized. In fact, by increasing n, we increase the chance of obtaining optimal threshold values for the segmentation of random sets. Performing random sets modelling with such large n values was computationally expensive. By comparison, fuzzy sets were computationally less expensive. However, the choice of optimal c, m, and t values for classification influences the results and requires a thorough investigation.
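The thresholding procedure described above can be sketched as follows. Each threshold applied to the water membership image yields one binary focal element, and the covering function is the proportion of focal elements that cover a pixel. The sketch assumes equal weights for all focal elements and uses p(1 − p) as the set-theoretic variance, which peaks at 0.25. The scaling of Γ v used in the paper may differ, and all function names and the small example grid are hypothetical.

```python
def focal_elements(membership, thresholds):
    """One binary water map (focal element) per threshold on the membership grid."""
    return [[[1 if value >= t else 0 for value in row] for row in membership]
            for t in thresholds]


def covering_function(elements):
    """Proportion of focal elements covering each pixel (equal weights assumed)."""
    n = len(elements)
    rows, cols = len(elements[0]), len(elements[0][0])
    return [[sum(e[r][c] for e in elements) / n for c in range(cols)]
            for r in range(rows)]


def summarize(cover):
    """Core, median and support sets, and the unscaled set-theoretic variance p(1 - p)."""
    return {
        "core": [[1 if p == 1.0 else 0 for p in row] for row in cover],
        "median": [[1 if p >= 0.5 else 0 for p in row] for row in cover],
        "support": [[1 if p > 0.0 else 0 for p in row] for row in cover],
        "variance": [[p * (1.0 - p) for p in row] for row in cover],
    }
```

For a hypothetical one-row membership grid `[[0.9, 0.5, 0.1]]` and thresholds 0.3, 0.4, 0.5, 0.6 and 0.7, the covering function is `[[1.0, 0.6, 0.0]]`. The first pixel belongs to the core set, the second only to the median and support sets, and the third to none of them. Using more thresholds (a larger n) refines the covering function at the cost of one additional segmentation per threshold.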
To model a shoreline using fuzzy sets, we need to adopt other concepts to quantify the extensional uncertainty of the shoreline, such as α-cut, shoreline as a margin, and fuzzy-crisp object [4,5,28], whereas a random sets approach through its covering function and statistical parameters directly quantifies the extensional uncertainty of shoreline without resorting to other concepts.
The integration of DTM data improved the results of both fuzzy sets and random sets. The combination of Pleiades and elevation data has a higher accuracy than Pleiades only. The additional DTM band leads to an improvement in the classification accuracy for roofs, inundated houses and inundated land. After this integration, roofs were clearly distinguished and separated from their surroundings (i.e., water and inundated soil). Usually, the ground close to a building is slightly higher than its surroundings, while a water area or inundated land clearly has a lower elevation. By using only Pleiades, it was difficult to discriminate dark roofs from water or inundated soil, since they often have similar spectral characteristics. The ability to discriminate between two objects with similar characteristics is influenced by the number of spectral bands available. The other objects that were successfully classified after the addition of the DTM were inundated houses and land. In this case, the elevation data help to identify the water area. Besides the benefit of adding the DTM to the classification, a downside was found as well, especially for tree objects. This is due to the time difference of one and a half years between the Pleiades and DTM data. In several locations, trees were submerged and no longer exist in the newer data, and these changes caused a loss of accuracy. In this case, the use of DTM data that have the same date of acquisition as the remote sensing image is preferable.

Conclusions
In this paper, fuzzy sets and random sets are compared for shoreline detection. Both methods performed well in modelling the uncertainty of shorelines and had similar results when using either Pleiades or Pleiades + DTM.
Application of fuzzy sets produced a higher classification accuracy for Pleiades + DTM than for Pleiades. Similarly, for random sets, Pleiades + DTM gained a significant improvement over Pleiades. Considerable improvements were achieved for objects such as roofs, inundated houses and yards. Pleiades + DTM achieved an accuracy above 80%, demonstrating that it provides a valuable data source for shoreline mapping. In the absence of elevation data, we may overestimate the water area in particular. The research further confirmed the need to integrate DTM data with remote sensing images to provide reliable and accurate shoreline mapping that may benefit coastal planners and managers. The proposed methods are to be applied in other areas in future studies. This will help to better understand how different conditions of an area can influence the results and to upscale the methods to larger areas of land.

Table S3: The optimal n selected for each threshold interval and the related κ values for generation of random sets; Figure S4: The curve of differences between two successive standardized core sets d i ; Figure S5: Samples of the random sets with various extents and their covering functions; Figure S6: The set-theoretic variance and the contour of random sets; Figure S7: An example of random sets; the core set Γ 1 and its contour (Columns 1 and 5); the support set Γ 0 and its contour (Columns 2 and 6); the transition zone between water and non-water represented by the set-theoretic variance (Columns 3 and 7); and zooming into the yellow rectangle sites (Columns 4 and 8); Table S4: The results of McNemar's test showing the significance of the different accuracies given by Pleiades and Pleiades + DTM (α = 0.05) in random sets; Table S5: The results of McNemar's test showing the significance of the difference given by fuzzy sets and random sets (α = 0.05) by using Pleiades; Table S6: The results of McNemar's test showing the significance of the difference given by fuzzy sets and random sets (α = 0.05) using Pleiades + DTM data.

Figure 1. The study area, located in Sayung sub-district, Central Java Province, Indonesia. It is presented here as a false colour composite of a Pleiades image with red colour representing the vegetation, bluish green showing water area, and greyish and white pixels showing the built-up area. Yellow rectangles represent several of the selected sites for this work, and black-dashed rectangles show four groups of subsets.

Figure 2. Density functions of shoreline object and related mixed Gaussian model.

Figure 3. Focal elements with their equal uncertainty assignments u 1 = u 2 = u 3 to construct a realization of random sets (a); and covering function of the random sets (b). These figures are adapted from Zhao et al. [62].

Figure 4. Comparison of the fuzzy classification results between: Pleiades (a,c); and Pleiades + DTM (b,d). Pleiades 0.5 m (e); and elevation data (f) are displayed to interpret the attribute of yellow points. In (c,d), we can see that Pleiades misclassified pixels as water instead of roofs (non-water), as can be seen in (e).

Figure 5. An example of inundated land that was: incorrectly classified by Pleiades (a,c); and classified successfully by Pleiades + DTM (b,d). Pleiades 0.5 m (e); and elevation data (f) are presented to interpret the yellow points.

Figure 6.The shoreline as the transition zone between water and non-water (a); the fuzziness of the shoreline is represented by the confusion index denoting the uncertainty of pixels to be classified to the largest membership (b); zooming into the white-dashed rectangle sites (c); and shoreline image with fuzziness represented by the confusion index (d).

Figure 7. Estimation of threshold interval for random sets based on the optimal n selected for each subset. Threshold interval = 0.3-0.7 generally produced the highest κ value. S a−b : the name of subsets, where a is the group number (a = 1, ..., 4) and b is the subset number (b = 1, ..., 13).

Figure 8. The curve of differences between two successive standardized core sets d i . The n value at which d i falls within the range −1 to +1 was selected for generating the random sets (see notations in Figure 7 for the name of subsets).

Figure 9. Samples of the random sets with various extents and their covering function. (a-e) Samples are at µ wx = 0.3-0.7. Pixels in white indicate the water area and pixels in black indicate the non-water area. (f) The related covering functions, where 0 indicates a low probability and 1 indicates a high probability to be covered by the random sets. Various extents of focal elements at each binary map can be seen when zooming into the yellow rectangle site.

Figure 10.Statistical distribution of area of focal elements sampled from 13 random sets (see notations in Figure 7 for the name of subsets).

Figure 11.The set-theoretic variance (a); some examples of the contour of Γ 1 , Γ 0.5 and Γ 0 (b); and a detail representing the yellow rectangle sites as an example of contours with a broad variation (c); and contours with a small variation indicating a narrow shoreline (d).
The gradual changes in the transition zone representing the shoreline are captured by the set-theoretic variance Γ v . Pixels with values close to 1 have a high variation, indicating a high uncertainty, whereas pixels with values close to 0 have a low variation, indicating a low uncertainty (Figure 12d,e). A clear distinction exists between narrow transition zones, for example those separating settlements and open water (Figure 12e, e.g., grid cells A2 and B2), and broad transition zones between open water and vegetation (Figure 12e, e.g., grid cell B3).

Figure 12.An example of a random set: the core set Γ 1 and its contour (a); the support set Γ 0 and its contour (b); the set-theoretic variance image (c); the transition zone between water and non-water represented by the set-theoretic variance values (d); and zoom-in to the yellow rectangle site (e).
McNemar's test results of random sets using Pleiades and Pleiades + DTM are shown in Table S4. Seven of the subsets show a significant improvement of Pleiades + DTM over the Pleiades image, as shown by their very low p-values (see subsets S 1−1 , S 1−2 , S 1−7 , S 1−9 , S 1−10 , S 3−5 , and S 4−4 in that table), whereas the remaining results show that similar accuracies were obtained from both datasets.

Supplementary Materials:
The following are available online at www.mdpi.com/2072-4292/9/9/885/s1, Figure S1: The estimation results of c and m for FCM classification; Table S1: The cluster validity index showing the compactness and the separateness among all clusters (applied using m = 1.6); Figure S2: The estimation of threshold interval for c = 2 and various m values for FCM classification; Table S2: The results of McNemar's test showing the significance of the different accuracies given by Pleiades and Pleiades + DTM (α = 0.05) in FCM classification with thresholding; Figure S3: The shoreline as the transition zone between water and non-water (Columns 1 and 5); confusion index images (Columns 2 and 6); zooming into the white-dashed rectangle sites (Columns 3 and 7); and shoreline images with fuzziness represented by the confusion index (Columns 4 and 8); Table ). It is 4.6 × 4.2 km, or 4618 × 4262 pixels. The central point of the area is at geographical coordinates 6°56′ S and 110°29′ E.

Table 1. The characteristics of the Pleiades image used.

Table 2. The statistical parameters of random sets.

Table 5. The accuracy comparison between Pleiades and Pleiades + DTM by random sets (see notations in Table 3 for the name of subsets).