An Enhanced IT2FCM* Algorithm Integrating Spectral Indices and Spatial Information for Multi-Spectral Remote Sensing Image Clustering

Interval type-2 fuzzy c-means (IT2FCM) clustering methods for remote-sensing data classification are based on interval type-2 fuzzy sets and can effectively handle uncertainty of membership grade. However, most of these methods neglect the spatial information when they are used in image clustering. The spatial information and spectral indices are useful in remote-sensing data classification. Thus, determining how to integrate them into IT2FCM to improve the quality and accuracy of the classification is very important. This paper proposes an enhanced IT2FCM* (EnIT2FCM*) algorithm by combining spatial information and spectral indices for remote-sensing data classification. First, the new comprehensive spatial information is defined as the combination of the local spatial distance and attribute distance or membership-grade distance. Then, a novel distance metric is proposed by combining this new spatial information and the selected spectral indices; these selected spectral indices are treated as another dataset in this distance metric. To test the effectiveness of the EnIT2FCM* algorithm, four typical validity indices along with the confusion matrix and kappa coefficient are used. The experimental results show that the spatial information definition proposed here is effective, and some spectral indices and their combinations improve the performance of the EnIT2FCM*. Thus, the selection of suitable spectral indices is crucial, and the combination of soil adjusted vegetation index (SAVI) and the Automated Water Extraction Index (AWEIsh) is the best choice of spectral indices for this method.


Introduction
Land-use/land-cover (LULC) mapping using remote-sensing data is crucial. The main task of LULC mapping is to identify different land-use types by mapping multi-scale remote-sensing data. Fuzzy c-means (FCM) clustering [1] is a classical unsupervised soft clustering method and is widely used in this domain because of its ability to handle fuzzy uncertainty. Classical FCM clustering methods are based on type-1 fuzzy set theory, which cannot address uncertainties associated with membership grade [2]. Some researchers have adopted interval type-2 fuzzy sets (IT2 FSs) to improve FCM clustering, and there are three strategies for addressing this problem: (1) The remote-sensing data expressed by real values are extended to interval numbers [3,4]. However, the widths of the intervals are difficult to determine because a single value must be related to an interval value, which can artificially increase the uncertainty in the data. (2) Hwang and Rhee [5] proposed the IT2FCM method using two fuzzifiers ($m_1$ and $m_2$) and IT2 FS; this strategy has been extensively discussed. (3) The land-use types are classified by FCM with different fuzzifiers, and the results are integrated using IT2 FS [6]. Due to its ability to handle the uncertainty of membership values, IT2FCM is widely used, and many derivative methods of IT2FCM have been developed, including the interval type-2 fuzzy possibilistic C-means (IFPCM) [7], interval-valued possibilistic fuzzy C-means (IPFCM) [8], general type-2 fuzzy C-means (GT2 FCM) [9], interval type-2 fuzzy C-means clustering with spatial information (IIT2-FCM) [10], and kernel interval-valued fuzzy C-means (KIFCM) clustering algorithms [11]. As noted by Zarinbal et al. [12], in some of these methods, the type-2 fuzzy membership functions are defuzzified into type-1 fuzzy membership functions during each iteration, and the distances between a sample and cluster centers should be expressed as singleton values when used to calculate the lower and upper membership grades in a certain class; otherwise, in these cases, some information would be lost. Beyond that, the spectrum of a geographic feature does not refer to a single spectral curve but rather to a connected set of possible spectral curves, namely, a spectral curve with a certain width [13]. To deal with the uncertainty in remote-sensing data, an interval type-2 fuzzy C-means clustering method based on the interval number distance (IND) and ranking, named the improved interval type-2 fuzzy c-means (IT2FCM*) clustering method, has been proposed [14].
However, classical FCM clustering approaches suffer from a lack of spatial information at the pixel level. In other words, they rely only on the intensity distribution of the pixels and disregard their geometric information, which makes them very sensitive to noise and other artifacts introduced during the imaging process [15]. To overcome this shortcoming, efforts have been made to integrate spatial information into FCM clustering, and the relevant methods can be grouped into three types: (1) those that integrate spatial information during each iteration, for example, the bias-corrected FCM (BCFCM) [16] and interval type-2 fuzzy C-means clustering with spatial information (IIT2-FCM) [10]; methods of this kind have low efficiency; (2) those that integrate spatial information in the data processing step, such as the enhanced FCM (EnFCM) [17], the fast generalized FCM (FGFCM) [18] and the noise detecting FCM (NDFCM) [19], in which the spatial information is usually defined as the spatial similarity or gray similarity in a certain local window; with these methods, it is unnecessary to compute the neighbor information in each iteration; and (3) those that analyze the uncertainty in the FCM classification result and reclassify the pixels using a W × W window [20]. It is very difficult to strike a balance between noise insensitivity and the retention of image details or local contextual information [21], especially for remote-sensing images, because most pixels in such images combine the spectral features of different geographical objects; these are called mixed pixels. Therefore, defining the spatial information is still a problem in remote-sensing classification.
Since the conceptual vegetation-impervious surface-soil triangle model proposed by Ridd [22] was first used in land-use/land-cover mapping with remote-sensing data, many spectral indices have been developed and widely used in applications, e.g., the normalized difference vegetation index (NDVI) [23], the soil adjusted vegetation index (SAVI) [24], the normalized difference water index (NDWI) [25], the normalized difference built-up index (NDBI) [26], the morphological building index (MBI), the morphological shadow index (MSI) [27], and the automated water extraction index (AWEI) [28]. Although these indices have proven effective in labeling special ground components [29–31], crucial problems can arise when they are applied to urban environments. The first problem is that most spectral indices are designed to highlight only a single land-cover type (e.g., vegetation or built-up area) and cannot differentiate among other land-cover types. The second problem is their limited applicability to remote-sensing imagery at different spatial and spectral resolutions [32]. In common methods, threshold values must be set for the different spectral indices to segment different land-cover types [33–35]. However, it is difficult to find suitable threshold values. Soft classification is therefore another option; for example, Yang et al. [36] discussed four combination scenarios of the modified FCM (MFCM) with water indices (WIs) and proposed a new surface water extraction method in which the selected water index is treated as a special feature in MFCM.
The objective of the present study is to integrate spatial information and spectral indices into the IT2FCM* method; the resulting method is named the enhanced IT2FCM* (EnIT2FCM*). The technical approach of this paper is shown in Figure 1. The Sentinel-2 dataset is used to test EnIT2FCM*, and ten bands of this dataset are selected; Section 4.1 introduces this dataset in detail. The paper is organized as follows. In the next section, a brief introduction to spectral indices and spatial information used in FCMs is given and IT2FCM* is introduced. In Section 3, the main idea behind the proposed algorithm is described. We construct a new spatial information description by membership degrees that is a combination of spatial similarity and membership value similarity. Then, the remotely sensed images are treated as one dataset source and the selected spectral indices as the other dataset source. A new distance is defined by these two datasets and the spatial information. In Section 4, we further analyze the proposed method. As noted previously, spectral indices may be confused or in conflict with one another [32]; therefore, selecting suitable indices is critical. Thus, we compare different combinations of indices and then test the effectiveness of the new spatial information definition by comparing it with the existing definitions. The validity indices for the EnIT2FCM* algorithm include PE-, PC-, XB- and FS-, and other measures such as the confusion matrix and kappa coefficient are also adopted to validate the classification results.
Remote Sens. 2017, 9, 960

Preliminaries
Relevant preliminaries are presented in this section. In Section 2.1, some spatial information definitions are reviewed. In Section 2.2, the basic concept of IT2 FS is introduced. The IT2FCM* algorithm is introduced in Section 2.3.

Spatial Information in FCM
As noted in Section 1, spatial texture information is considered in FCMs, and many researchers have discussed this topic in the past two decades. We will review some typical spatial information definitions. In BCFCM [16], the objective function is defined as follows:

$$J_m = \sum_{i=1}^{C}\sum_{j=1}^{N} u_{ij}^{m}\,\|x_j - v_i\|^2 + \frac{\alpha}{N_R}\sum_{i=1}^{C}\sum_{j=1}^{N} u_{ij}^{m} \sum_{r \in N_j} \|x_r - v_i\|^2 \qquad (1)$$

where $C$ is the number of clusters, $N$ is the number of pixels in the dataset, $\{v_i\}_{i=1}^{C}$ are the centroids of all clusters, $x_j$ is the value of the j-th pixel, $u_{ij}$ represents the membership degree of the j-th pixel to cluster i, $m$ is the fuzzifier, $N_j$ is the set of neighbors that exist in a window centered around the j-th pixel, $N_R$ is the cardinality of $N_j$, $x_r$ is the value of the r-th pixel in $N_j$, and $\alpha$ controls the effect of the spatial information. The neighbor information must be computed in each iteration, so the efficiency of this method is low. Chen and Zhang [37] proposed FCMS1 and FCMS2 by replacing the spatial information with mean-filtered and median-filtered information, respectively. In EnFCM, a linearly weighted sum image $\xi$ is formed from the original image and its local neighbor average image in the data processing step:

$$\xi_i = \frac{1}{1+\alpha}\left(x_i + \frac{\alpha}{N_R}\sum_{j \in N_i} x_j\right) \qquad (2)$$

where $\xi_i$ is the gray-level value of the i-th pixel of the image $\xi$, $N_i$ is the set of neighbors that exist in a window centered around the i-th pixel, $N_R$ is the cardinality of $N_i$, and the parameter $\alpha$ plays the same role as before. Because $\xi$ is calculated during the data preprocessing, this algorithm runs faster than the algorithms that integrate spatial information during each iteration.
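The EnFCM preprocessing step described above can be sketched in a few lines. This is a minimal illustration, not the cited authors' implementation: it assumes the commonly cited form of the weighted sum image, $\xi_i = (x_i + (\alpha/N_R)\sum_{j \in N_i} x_j)/(1+\alpha)$, a square window that excludes the central pixel from its own neighborhood, and replicated borders. The function name `enfcm_weighted_image` and its parameters are hypothetical.

```python
import numpy as np

def enfcm_weighted_image(x, alpha=1.0, window=3):
    """Linearly weighted sum of a 2-D image and its local neighbor average."""
    x = np.asarray(x, dtype=float)
    r = window // 2
    rows, cols = x.shape
    # Assumption: border pixels are replicated so every pixel has a full window.
    padded = np.pad(x, r, mode="edge")
    neighbor_sum = np.zeros_like(x)
    for dr in range(-r, r + 1):
        for dc in range(-r, r + 1):
            if dr == 0 and dc == 0:
                continue  # assumption: the central pixel is not its own neighbor
            neighbor_sum += padded[r + dr : r + dr + rows, r + dc : r + dc + cols]
    n_r = window * window - 1  # cardinality N_R of the neighborhood
    return (x + alpha * neighbor_sum / n_r) / (1.0 + alpha)
```

Because the image is computed once before clustering, the cost of the neighborhood pass is not repeated in each iteration, which is the efficiency argument made in the text. Setting `alpha=0` returns the original image unchanged.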
In FGFCM, a local similarity measure that combines both spatial and gray-level image information is defined as follows:

$$S_{ij} = \begin{cases} \exp\left(-\dfrac{\max(|p_j - p_i|,\ |q_j - q_i|)}{\lambda_s} - \dfrac{\|x_i - x_j\|^2}{\lambda_g\,\sigma_i^2}\right), & i \neq j \\ 0, & i = j \end{cases} \qquad (3)$$

where the i-th pixel is the center of the local window and the j-th pixel is one of the neighbors that fall within the local window around the i-th pixel, $(p_i, q_i)$ is the coordinate of the i-th pixel, $x_i$ is its gray-level value, $\lambda_s$ and $\lambda_g$ are two scale parameters that play roles similar to that of the parameter $\alpha$ in EnFCM, and $\sigma_i$ is calculated as follows:

$$\sigma_i = \sqrt{\frac{\sum_{j \in N_i} \|x_i - x_j\|^2}{N_R}} \qquad (4)$$

Then, a new image (linearly weighted sum image) $\xi$ can be generated by the following:

$$\xi_i = \frac{\sum_{j \in N_i} S_{ij}\,x_j}{\sum_{j \in N_i} S_{ij}} \qquad (5)$$

where $\xi_i$ is the gray-level value of the i-th pixel of the new image $\xi$. The original image is then replaced by $\xi$ in this algorithm. Clearly, a pixel $\xi_i$ in $\xi$ no longer contains the intrinsic information $x_i$ because $S_{ij}$ equals 0 when $i = j$ in Equation (5). The algorithm proposed by Wang and Bu [38] has the same problem. It is commonly accepted that spatial, textural and spectral information are the most important cues for human visual interpretation in remote-sensing image classification.
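To make the FGFCM similarity measure concrete, the sketch below builds the filtered image for a single-band array. It is an illustration under stated assumptions rather than the reference implementation: it uses the commonly published FGFCM form (Chebyshev spatial distance scaled by $\lambda_s$, squared gray-level difference scaled by $\lambda_g \sigma_i^2$, and zero weight for the central pixel), replicated borders, and a guard for perfectly homogeneous windows. The function name and default parameter values are hypothetical.

```python
import numpy as np

def fgfcm_image(x, lambda_s=3.0, lambda_g=6.0, window=3):
    """Similarity-weighted neighbor average of a 2-D image (central pixel excluded)."""
    x = np.asarray(x, dtype=float)
    r = window // 2
    rows, cols = x.shape
    padded = np.pad(x, r, mode="edge")  # assumption: replicate border pixels
    offsets = [(dr, dc) for dr in range(-r, r + 1) for dc in range(-r, r + 1)
               if (dr, dc) != (0, 0)]  # S_ij = 0 when i = j
    neighbors = np.stack([padded[r + dr : r + dr + rows, r + dc : r + dc + cols]
                          for dr, dc in offsets])
    sq_diff = (neighbors - x) ** 2
    # sigma_i^2: mean squared gray-level difference inside the window.
    sigma2 = sq_diff.mean(axis=0)
    # Guard: in a homogeneous window every difference is 0, so any positive value works.
    sigma2 = np.where(sigma2 > 0.0, sigma2, 1.0)
    chebyshev = np.array([max(abs(dr), abs(dc)) for dr, dc in offsets], dtype=float)
    s = np.exp(-chebyshev[:, None, None] / lambda_s - sq_diff / (lambda_g * sigma2))
    return (s * neighbors).sum(axis=0) / s.sum(axis=0)
```

Since the central pixel receives zero weight, each output value is a convex combination of the neighbors only, which illustrates the loss of intrinsic information noted in the text.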
As some authors have noted, it is very difficult to strike a balance between noise insensitivity and the preservation of image details. Previous works have proposed several methods to achieve this balance automatically; for example, Krinidis and Chatzis proposed the fuzzy factor to express local spatial information in their fuzzy local information c-means (FLICM) algorithm [39]:

$$G_{ij} = \sum_{r \in N_j,\ r \neq j} \frac{1}{d_{jr} + 1}\,(1 - u_{ir})^{m}\,\|x_r - v_i\|^2 \qquad (6)$$

where the j-th pixel is the center of the local window, i is the reference cluster, the r-th pixel is one of the neighbors that fall within the local window around the j-th pixel, $d_{jr}$ is the spatial Euclidean distance between pixels r and j, $u_{ir}$ is the membership degree of the r-th pixel to cluster i, $v_i$ is the center of cluster i, and $m$ is the fuzzifier. Then, the objective function is expressed as follows:

$$J_m = \sum_{j=1}^{N}\sum_{i=1}^{C}\left[u_{ij}^{m}\,\|x_j - v_i\|^2 + G_{ij}\right] \qquad (7)$$

Guo et al. [19] proposed the noise point probability for determining the balance parameters. In its definition, $\lambda_\alpha$ is a given parameter for controlling the scale, and $N_i$ and $N_R$ still represent the corresponding neighborhood window and the number of pixels in it, respectively. In the corresponding objective function, $\xi_i$ is the weighted mean computed according to (5) and $\bar{x}_i$ is the mean of the corresponding neighborhood. Guo et al. claimed that the larger the difference between the central pixel and its surrounding pixels is, the more likely it is that the pixel is a noise point [19].
Inspired by the methods proposed by Cai et al. [18] and Krinidis and Chatzis [39], Zhang et al. [21] proposed a novel algorithm based on the concept of pixel relevance. The pixel relevance between two pixels is measured by two image patches centered on them. Then, the fuzzy factor is inferred from the membership degree and the remote-sensing dataset, and it is calculated during each iteration. However, this process neglects the spatial distance between the center pixel and its surrounding pixels, and it requires very high computational resources if the dataset has a large amount of attribute information, as is the case for hyperspectral images. Therefore, we propose a new spatial information definition in Section 3 to overcome these faults.

The Interval Type-2 Fuzzy Set
Fuzzy sets (i.e., type-1 fuzzy sets) have been applied to many domains because of their ability to model fuzziness [40]. However, fuzzy sets cannot manage the errors associated with the membership values of fuzzy objects. This flaw can be overcome by using type-2 fuzzy sets (T2 FS) and type-N fuzzy sets, which were introduced by Zadeh [41]. These shortcomings were recognized following the introduction and development of the type-1 fuzzy set theory, after which type-2 fuzzy set theory began to receive more attention. Some researchers provided a full comparison between type-1 and type-2 fuzzy sets and summarized the development and application of the latter [2,42].
A common type-2 fuzzy set features a non-interval secondary membership function, which makes computation extremely difficult. Additionally, the secondary membership value or function is difficult to handle. The interval type-2 fuzzy set (IT2 FS) is a special case of the T2 FS, and many authors have referred to IT2 FS as T2 FS and added the qualifying term "generalized" only when discussing non-interval type-2 fuzzy sets [2]. The next portion of this section introduces concepts that are relevant to these interval type-2 fuzzy sets.
Definition 1 [43]. An interval type-2 fuzzy set $\tilde{A}$ on $X$ is a bivariate function $\mu_{\tilde{A}}$ on a Cartesian product, where $X$ is the universe for the primary variable $x$ of $\tilde{A}$; it also admits a point-valued representation. For the sake of convenience, the IT2 FS is denoted $\tilde{A}$, and the membership value of an element $x$ in $\tilde{A}$ is an interval bounded by a lower and an upper membership grade.

IT2FCM*
In the standard FCM method, users set two parameters: the number of classes $C$ and the fuzzifier $m$ [1]. The cluster centers are expressed by real-valued vectors, and the distances between a sample and the cluster centers are used to determine the membership grade of a sample belonging to a class. As noted above, classical FCM clustering methods cannot handle uncertainty of the membership degree. Hwang and Rhee proposed IT2FCM based on the IT2 FS to handle fuzziness uncertainty in fuzzy clustering [5]. The lower and upper membership functions are constructed using two fuzzifiers, $m_1$ and $m_2$, but IT2FCM and its extended algorithms, such as IIT2-FCM, still have some faults, which are discussed in Section 1. Thus, we proposed IT2FCM* based on the interval number distance and ranking [14].
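As a point of reference for the two-fuzzifier construction, the following sketch shows the standard type-1 FCM membership and centroid updates, and one simple way to obtain lower and upper membership grades from two fuzzifiers. It is a hedged illustration only: it uses crisp Euclidean distances rather than the interval number distance of IT2FCM*, and taking the element-wise minimum and maximum of the two membership matrices is an assumption in the spirit of IT2FCM, not a statement of the exact rule used in [5] or [14]. All function names are hypothetical, and memberships are stored as an (N, C) array.

```python
import numpy as np

def fcm_memberships(x, v, m, eps=1e-12):
    """Standard FCM membership update; x is (N, M), v is (C, M), result is (N, C)."""
    d2 = ((x[:, None, :] - v[None, :, :]) ** 2).sum(axis=2) + eps  # squared distances
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def fcm_centroids(x, u, m):
    """Standard FCM centroid update: membership-weighted mean of the samples."""
    um = u ** m
    return (um.T @ x) / um.sum(axis=0)[:, None]

def interval_memberships(x, v, m1=2.1, m2=5.0):
    """Assumed construction: bounds are the element-wise min and max over m1 and m2."""
    u1 = fcm_memberships(x, v, m1)
    u2 = fcm_memberships(x, v, m2)
    return np.minimum(u1, u2), np.maximum(u1, u2)
```

The default values 2.1 and 5 mirror the fuzzifier settings reported in the experimental section; they are otherwise arbitrary here.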
There are multiple definitions of the distance between interval numbers. The Euclidean distance between interval numbers is commonly used, but its obvious fault is that it considers only the endpoints of the interval numbers [44]. We tested these distance definitions in IT2FCM* and found that the definition proposed by Li et al. [45] is more effective than the others. This definition is an extension of the definition proposed by Bertoluzza et al.
Let $a = [a^-, a^+]$ and $b = [b^-, b^+]$ be two interval numbers. The interval distance between $a$ and $b$ is then given by Equation (11). As in IT2FCM, the lower and upper membership functions are constructed using two fuzzifiers, $m_1$ and $m_2$. Then, two objective functions can be established by Equations (12) and (13), where $d_{ik} = \|x_k - v_i\|$ is the distance metric between the sample $x_k$ and the cluster centroid $v_i$; it is the interval vector distance based on Equation (11). Here, $x_k = (x_{k1}, x_{k2}, \ldots, x_{kM})$ is a sample; $v_i$ is an interval number vector; $l = 1, 2, \ldots, M$ indexes the features; $M$ is the number of features; $C$ is the number of clusters; and $N$ is the number of samples. These centroids are determined by the KM algorithm [46] during each iteration in IT2FCM*. Based on the IND methods and two different fuzziness parameters, i.e., $m_1$ and $m_2$, the lower and upper membership grades of each sample can then be computed. The iteration can be stopped when $|J_m^{t+1}(U, v) - J_m^{t}(U, v)| \le \varepsilon$ is satisfied. Then, the lower and upper membership grades of each sample belonging to each class are determined, and they form an interval number vector $\{\tilde{u}_{1k}, \tilde{u}_{2k}, \ldots, \tilde{u}_{Ck}\} = \{[\underline{u}_{1k}, \overline{u}_{1k}], [\underline{u}_{2k}, \overline{u}_{2k}], \ldots, [\underline{u}_{Ck}, \overline{u}_{Ck}]\}$. The possibility degree of any two intervals in the vector can then be calculated from their endpoints and from $L(\tilde{u}_{ik}) = \overline{u}_{ik} - \underline{u}_{ik}$ and $L(\tilde{u}_{jk}) = \overline{u}_{jk} - \underline{u}_{jk}$, the widths of the interval numbers $\tilde{u}_{ik}$ and $\tilde{u}_{jk}$, respectively, for $i, j = 1, 2, \ldots, C$ and $k = 1, 2, \ldots, N$.
We can then obtain a possibility matrix $P = (p_{ij,k})$. Moreover, the ranking vector $w_k = (w_{1k}, w_{2k}, \ldots, w_{Ck})^T$ can be calculated from this matrix, and the index of the maximum value in $w_k$ is the class index of the sample.
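The ranking step can be illustrated with a short sketch. The exact possibility and ranking formulas of IT2FCM* are not reproduced in this section, so the code below assumes a widely used pair: the possibility degree $p(a \ge b) = \min\{\max\{(a^+ - b^-)/(L(a) + L(b)),\ 0\},\ 1\}$ and the ranking weight $w_i = (\sum_j p_{ij} + C/2 - 1)/(C(C-1))$. Both formulas and the function names are assumptions made for illustration.

```python
import numpy as np

def possibility(a, b):
    """Assumed possibility degree p(a >= b) for intervals a = (lo, hi), b = (lo, hi)."""
    width = (a[1] - a[0]) + (b[1] - b[0])
    if width == 0.0:  # both intervals degenerate to crisp numbers
        return 1.0 if a[0] > b[0] else (0.5 if a[0] == b[0] else 0.0)
    return min(max((a[1] - b[0]) / width, 0.0), 1.0)

def classify_by_ranking(intervals):
    """Return the index of the top-ranked interval membership and the ranking vector."""
    c = len(intervals)
    p = np.array([[possibility(intervals[i], intervals[j]) for j in range(c)]
                  for i in range(c)])
    # Assumed ranking formula; the weights sum to 1 because p_ij + p_ji = 1.
    w = (p.sum(axis=1) + c / 2.0 - 1.0) / (c * (c - 1))
    return int(np.argmax(w)), w
```

For one pixel with interval memberships `[(0.1, 0.3), (0.5, 0.8), (0.2, 0.4)]`, the second class dominates both others, so `classify_by_ranking` selects index 1.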

Methodology
As discussed above, spatial information and spectral indices are useful for remote-sensing data classification. In this section, we incorporate these two types of data sources into IT2FCM*; the improved algorithm is called the EnIT2FCM* method here.

Spectral Indices
A spectral index is a combination, often a ratio, of two or more bands that has proven to be effective in LULC mapping. Unfortunately, some spectral indices have strong limitations on the data type or sensor type; for example, the WorldView-2 sensor dataset has no SWIR band; thus, the AWEI indices cannot be used for it. We will review some commonly used indices in this section; a summary is provided in Table 1.
The MBI is selected in addition to the spectral indices in Table 1. These selected spectral indices can be classified into three types. NDVI, EVI and SAVI belong to the vegetation indices (VIs); NDWI, MNDWI, AWEI$_{nsh}$ and AWEI$_{sh}$ belong to the water indices (WIs); and NDBI, NDBaI, and MBI belong to the built-up indices (BIs). Yang et al. reported that the combinations of FCM with WIs can essentially be divided into four scenarios, and the scenario in which the WIs are regarded as newly generated bands achieves a balance between simplicity and effectiveness [36]. In this paper, we adopt this strategy to combine spectral indices. However, in addition to providing useful information, these spectral indices bring noise. For example, if we assume the threshold value is 0.05 for AWEI$_{sh}$, a value greater than 0.05 will indicate a water body, which will enhance the classification, but a value less than 0.05 cannot indicate the correct land-cover type, resulting in noise. Although the most essential characteristic for discriminating a target object from its surrounding objects is the difference between their spectral curves, two problems are encountered. One is selecting which spectral indices or combinations to use in image clustering, and the other is setting the weights of these selected indices. This paper proposes a solution for these two problems in the EnIT2FCM* algorithm.

In Table 1, BLUE, GREEN, RED, NIR, SWIR1, SWIR2 and TIR are the reflectance values of the blue, green, red, near-infrared, short-wave-infrared 1, short-wave-infrared 2 and thermal infrared bands, respectively.
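For readers who want to generate index bands from reflectance data, the sketch below computes several of the indices named in this section. The formulas are the commonly published forms of NDVI, SAVI (with soil factor L = 0.5), NDWI, MNDWI, NDBI, AWEI$_{nsh}$ and AWEI$_{sh}$; they are stated here as an assumption and may differ in detail from the entries of Table 1. The function name and argument names are hypothetical, and the inputs are assumed to be reflectance arrays of equal shape with non-zero denominators.

```python
import numpy as np

def spectral_indices(blue, green, red, nir, swir1, swir2, soil_factor=0.5):
    """Compute common spectral indices from per-band reflectance arrays."""
    blue, green, red, nir, swir1, swir2 = (
        np.asarray(b, dtype=float) for b in (blue, green, red, nir, swir1, swir2)
    )
    return {
        # Vegetation indices
        "NDVI": (nir - red) / (nir + red),
        "SAVI": (1.0 + soil_factor) * (nir - red) / (nir + red + soil_factor),
        # Water indices
        "NDWI": (green - nir) / (green + nir),
        "MNDWI": (green - swir1) / (green + swir1),
        "AWEI_nsh": 4.0 * (green - swir1) - (0.25 * nir + 2.75 * swir2),
        "AWEI_sh": blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2,
        # Built-up index
        "NDBI": (swir1 - nir) / (swir1 + nir),
    }
```

Each returned array can be stacked with the original bands as an additional feature, which corresponds to the strategy of treating indices as newly generated bands described above.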

Spatial Information Measure
Usually, the spatial information in image classification depends mainly on both the local geometric relationship and the attributive relationship. The local geometric relationship is measured by the spatial Euclidean distances between the central pixel and its surrounding pixels in a local window; the closer the distance, the higher the similarity. The local attributive relationship is measured by the distances between the attributes of the central pixel and its surrounding pixels. Let the size of the local window be $H = n \times n$. Then, we can define a comprehensive distance, given by Equation (18), by combining the local spatial distance and attribute distance based on Equation (4). In this definition, $d^{S}_{kh}$ is the spatial distance between central pixel k and surrounding pixel h in its neighbor window, and $d^{f}_{kh}$ is the distance based on the attribute or membership degree; the Euclidean distance is adopted in this paper. Our experiment shows that either the attribute of a pixel or the membership degree of a pixel belonging to a certain class can be used to measure the local attributive relationship, and a similar result is produced.
The spatial information measure is then defined by Equation (19), in which $\underline{u}_{ih}$ and $\overline{u}_{ih}$ are the lower and upper membership grades, respectively, of a pixel h belonging to class i.

The Enhanced IT2FCM*
When taking spectral indices and spatial information into account, the dataset X of images is constructed from the original dataset Y and the selected spectral indices Z. The corresponding class centers then consist of a spectral attribute center $v_i^{Y}$ and a spectral index center $v_i^{Z}$. A new distance $dss_{ik}$ between pixel k and centroid i is proposed, which combines the spectral information of the pixel, its spectral indices, and the spatial information from surrounding pixels. This new distance is given by Equation (20), where $\|\cdot\|$ is the interval distance defined by Equation (4); more specifically, $\|y_k - v_i^{Y}\|$ is the spectral attribute distance of a pixel k to the spectral attribute center of class i, and $\|z_k - v_i^{Z}\|$ is the spectral index distance of a pixel k to the spectral index center of class i. $SI_{ik}$ is the spatial information and is calculated by Equation (19). The parameter $\alpha \in [0, 1]$ controls the relative impact of the spatial information in the local window; if $\alpha = 0$, the spatial information is not taken into account. The parameter $\beta$ controls the effect of the selected spectral indices. Theoretically, the selected spectral indices could be used for classification directly; thus, $\beta$ is not limited to values less than 1. If $\alpha = 0$ and $\beta = 0$ at the same time, then the algorithm reduces to the IT2FCM* algorithm.
This new algorithm is named the enhanced IT2FCM* (EnIT2FCM*) algorithm. Its process is shown in Figure 2 and described in Table 2.

Table 2. Steps of the EnIT2FCM* algorithm.
Step 1. … and the termination criterion value ε.
1.4 Initialize the lower and upper membership grade matrix $\tilde{u} = [\underline{u}, \overline{u}]$ using a random method.
Step 2. Compute all centroids of the original data and selected spectral indices and then calculate the spatial information and update the membership grade matrix.
2.1 Calculate all centroids of the original data … , $i = 1, 2, \ldots, C$, and determine their lower and upper bounds $v_i^{L}$ and $v_i^{R}$, respectively, via the KM algorithm.
2.2 Calculate the comprehensive distance using Equation (18) and the spatial information using Equation (19).
2.3 Calculate the new distance between the pixel k and the centroid i using Equation (20).
2.4 Update the lower and upper membership grade matrix $\tilde{u} = [\underline{u}, \overline{u}]$ using Equations (15) and (16).
Step 3. Classify each sample using the interval number ranking method.
3.1 Calculate the possibility matrix using Equation (16) and then obtain the ranking vector.
3.2 Assign a sample to a cluster.

Experimental Results and Discussion
In this section, we will use the Sentinel-2 dataset to test the EnIT2FCM* algorithm. The study area is classified into five types: building, bare land, grass, wood and water body. Here, the fuzzifiers $m_1$ and $m_2$ are set to 2.1 and 5, respectively. The effects of spatial information and spectral indices are discussed in the following, and their combined effect is also described.

Data Set and Study Area
The study area is located in southwestern Tianjin City, a metropolis in northern coastal Mainland China and one of the five national central cities of the country, with a latitude ranging from 38°34′ to 40°15′ N and a longitude ranging from 116°43′ to 118°04′ E (see Figure 3). Three universities are included within the study area: Tianjin Normal University, Tianjin Polytechnic University, and the Tianjin University of Technology. This region was chosen as the study area mainly for the convenience of validation. The land cover of this study area is mainly composed of buildings, ponds, trees, grassland, trails, and concrete roads.
In this paper, a cloud-free multispectral high-resolution image from the Sentinel-2 satellite, acquired on 28 August 2016, is used to test the classification ability of the EnIT2FCM* algorithm. Sentinel-2A was launched on 23 June 2015. It is a polar-orbiting, multispectral high-resolution imaging mission for land monitoring that provides, for example, imagery of vegetation, soil and water cover, inland waterways and coastal areas. The spectral region ranges from 0.44 to 2.2 µm, with 13 spectral channels in the visible, near-infrared, and short-wave-infrared parts of the spectrum. Among these 13 bands, 3 channels are designed for detecting coastal aerosol (0.443 µm), water vapor (0.945 µm), and cirrus (1.375 µm), with a spatial resolution of 60 m. The spatial resolution of visible bands 2 to 4 (spectral region ranging from 0.490 to 0.665 µm) and NIR band 8 (0.842 µm) is 10 m, and the spatial resolution of bands 5 to 7 (ranging from 0.705 to 0.783 µm), band 8A (0.865 µm) and bands 11 and 12 (with wavelengths of 1.610 µm and 2.190 µm, respectively) is 20 m. The acquired dataset is a Level-1C product, which contains top-of-atmosphere (TOA) reflectance data. This product is currently processed using a processor running on the European Space Agency's (ESA's) Sentinel-2 Toolbox. Before the classification, the necessary preprocessing was carried out to convert the Level-1C product into a Level-2A product, which contains bottom-of-atmosphere (BOA) reflectance data. This step is mainly performed by the SNAP software and its plug-in component Sen2Cor, which are provided on the official website of the ESA.

The Effect of Spatial Information
The main work of this subsection is a comparative analysis of the spatial information definition of Cai et al. [18] (FGFCM), the definition of Ngo et al. [10] (IIT2-FCM), and the definition proposed in the present paper (IT2FCM*_S). The size of the local window is 3 × 3. Four typical validity indices for type-1 fuzzy clustering are selected in this research: the partition coefficient (PC-), the partition entropy (PE-), the Fukuyama and Sugeno index (FS-), and the Xie and Beni index (XB-) [51]. During the process, the parameter α is set to 1 and the corresponding error coefficient is set to 0.001. The value of PC- indicates the average relative amount of membership sharing between pairs of fuzzy subsets; the higher the PC- value is, the better the corresponding classification results will be. PE- is a scalar measure of the amount of fuzziness in a set of results, FS- is designed to measure the discrepancy between fuzzy compactness and fuzzy separation, and XB- measures the average within-cluster fuzzy compactness against the minimum between-cluster separation. For these three indices, smaller values indicate better clustering performance.
Table 3 compares the indices of FCM, IFCM, IT2FCM*, FGFCM, IIT2-FCM and IT2FCM*_S; all of these spatial information definitions improve the corresponding results. IT2FCM* with spatial information (IT2FCM*_S) yields the maximum value of PC- and the minimum values of PE- and FS-. Although the XB- value of IT2FCM*_S is not the smallest, the results still indicate that spatial information improves the performance of EnIT2FCM*. Figure 4 shows the classification results of these methods, which also support this conclusion.
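To make the four validity indices concrete, the following minimal sketch computes them from a type-1 membership matrix using their commonly cited textbook forms. It is an illustration only, not the implementation used in this paper: the function names, the data layout (u[i][k] is the membership of sample k in cluster i) and the choice of the mean of the cluster centres as the reference point in FS- are assumptions, and the definitions in [51] may differ in detail.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def partition_coefficient(u):
    """PC = (1/n) * sum_i sum_k u[i][k]^2; larger values are better."""
    n = len(u[0])
    return sum(v * v for row in u for v in row) / n


def partition_entropy(u):
    """PE = -(1/n) * sum_i sum_k u[i][k] * ln(u[i][k]); smaller is better."""
    n = len(u[0])
    return -sum(v * math.log(v) for row in u for v in row if v > 0.0) / n


def xie_beni(u, data, centers):
    """XB = fuzzy compactness / (n * minimum squared centre separation)."""
    n, c = len(data), len(centers)
    compactness = sum(u[i][k] ** 2 * sq_dist(data[k], centers[i])
                      for i in range(c) for k in range(n))
    separation = min(sq_dist(centers[i], centers[j])
                     for i in range(c) for j in range(i + 1, c))
    return compactness / (n * separation)


def fukuyama_sugeno(u, data, centers, m=2.0):
    """FS = sum_i sum_k u^m * (d(x_k, v_i)^2 - d(v_i, v_mean)^2).

    Assumption: v_mean is the mean of the cluster centres (one common variant).
    """
    c, dim = len(centers), len(centers[0])
    mean_center = [sum(v[d] for v in centers) / c for d in range(dim)]
    return sum(u[i][k] ** m * (sq_dist(data[k], centers[i])
                               - sq_dist(centers[i], mean_center))
               for i in range(c) for k in range(len(data)))


# Toy example (hypothetical values): two 1-D samples, two clusters.
data = [[0.0], [1.0]]
centers = [[0.0], [1.0]]
u = [[0.9, 0.1],
     [0.1, 0.9]]
print(partition_coefficient(u), partition_entropy(u),
      xie_beni(u, data, centers), fukuyama_sugeno(u, data, centers))
```

A crisper partition drives PC- toward 1 and PE- toward 0, which is why a larger PC- and a smaller PE- are read as better clustering in the comparison above.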

The Effect of Spectral Indices
In this section, we test the performance of spectral indices in EnIT2FCM*. As noted above, spectral indices may provide useful information as well as noise, so the selection of suitable spectral indices is crucial. The strategy adopted in this paper includes three steps. The first step is to calculate the spectral indices listed in Table 1; the results are shown in Figure 5. The second step is to integrate these indices into Equation (20) one by one and test the validity indices PC-, PE-, XB- and FS-; the results are shown in Figure 6. For these validity indices, the maximum value of PC- is achieved when NDVI is used and the minimum when AWEInsh is used; conversely, the maximum value of PE- is achieved when AWEInsh is used and the minimum when NDVI is used. The maximum and minimum values of XB- are achieved using NDVI and EVI, respectively, and the maximum and minimum values of FS- are achieved using NDBI and EVI, respectively. Regarding the vegetation indices (VIs), the results of EVI and SAVI are similar, while NDVI achieves the best values of PC- and PE- but poor values of XB- and FS-. Regarding the water indices (WIs), AWEIsh is superior to the others overall. Regarding the built-up indices (BIs), MBI is better than NDBI and NDBaI, and NDBI has the worst performance of all of these spectral indices. The classification results of EnIT2FCM* combined with a single spectral index are shown in Figure 7. Two vegetation indices (EVI and SAVI) improve the classification very well, as shown in Figure 7a,b. However, NDWI performs poorly (Figure 7c): the river is classified as bare land, and some buildings are also classified as bare land. AWEIsh and MBI can improve the classification but still have some problems; for example, some buildings are classified as bare land when AWEIsh is used (Figure 7d), and bare land is wrongly classified as wooded areas when MBI is used (Figure 7e).
Next, a variety of combinations of different indices are built and integrated into Equation (20) one by one to evaluate PC- and PE-. It should be noted that some validity indices, such as FS- and XB-, cannot be used because their dimensions are not equal; therefore, the confusion matrix and kappa coefficient are adopted here. We tested more than 30 representative combinations according to the analysis of Step 1, and eight combinations are selected for discussion here. Combination 6 (see column ID in Table 4) has the best validity effect because it has the greatest PC- value and the minimum PE- value; however, its overall accuracy and kappa coefficient values are poor. Combination 1 has the optimal user accuracy and kappa coefficient value, but its PC- and PE- values are not very good. Combination 7 has the worst performance among these combinations, which indicates that an excessive combination of spectral indices does not improve the classification accuracy and, to some extent, may degrade it. Combination 1 has the best performance overall in terms of accuracy and kappa coefficient and relatively good performance in terms of PC- and PE-; thus, in this step, the combination of SAVI and AWEIsh is regarded as the best combination, as shown in Figure 8a. The only difference between Combinations 1 and 4 is that Combination 4 includes one more spectral index, MBI. However, the performance of Combination 4 is not as good as that of Combination 1, and the same situation occurs with Combination 5. Thus, we conclude that combining suitable spectral indices in IT2FCM* will improve the classification performance.
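As an illustration of the first step, the sketch below computes a few of the indices discussed above from per-pixel reflectance values, using their widely published definitions. It is not taken from Table 1, whose formulations may differ in detail; the function names, the soil factor of 0.5 in SAVI, the zero-denominator guard and the Sentinel-2 band assignment (B2 blue, B3 green, B4 red, B8 NIR, B11 SWIR1, B12 SWIR2) are assumptions made for this example.

```python
def _ratio(numerator, denominator):
    """Safe division; returns 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator != 0 else 0.0


def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return _ratio(nir - red, nir + red)


def savi(nir, red, soil_factor=0.5):
    """Soil adjusted vegetation index; soil_factor is the usual L term."""
    return _ratio((1.0 + soil_factor) * (nir - red), nir + red + soil_factor)


def ndwi(green, nir):
    """Normalized difference water index (green/NIR form)."""
    return _ratio(green - nir, green + nir)


def awei_sh(blue, green, nir, swir1, swir2):
    """Automated water extraction index, shadow variant."""
    return blue + 2.5 * green - 1.5 * (nir + swir1) - 0.25 * swir2


def awei_nsh(green, nir, swir1, swir2):
    """Automated water extraction index, no-shadow variant."""
    return 4.0 * (green - swir1) - (0.25 * nir + 2.75 * swir2)


def index_features(pixel):
    """Return [SAVI, AWEIsh] for one pixel given as a dict of band reflectances.

    The dict keys are hypothetical and follow Sentinel-2 band names.
    """
    return [
        savi(pixel["B8"], pixel["B4"]),
        awei_sh(pixel["B2"], pixel["B3"], pixel["B8"], pixel["B11"], pixel["B12"]),
    ]


# Hypothetical reflectance values for a vegetated pixel.
pixel = {"B2": 0.05, "B3": 0.08, "B4": 0.10, "B8": 0.50, "B11": 0.20, "B12": 0.10}
print(ndvi(pixel["B8"], pixel["B4"]), index_features(pixel))
```

The two-element vector returned by index_features corresponds to the SAVI and AWEIsh pair identified above as the best combination; how such a vector enters the distance metric is defined by Equation (20) and is not reproduced here.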

Combination Effect of Spatial Information and Spectral Indices
In Section 4.2, we tested the effects of spatial information on four validity indices and showed that the spatial information measure proposed in this paper achieves good performance. In Section 4.3, we found that EnIT2FCM* with the combination of SAVI and AWEIsh has the best performance. In this subsection, we test the combined effect of spatial information and the spectral indices SAVI and AWEIsh; the results are shown in Figure 8b. Four validity indices, namely, PC-, PE-, XB- and FS-, along with the accuracy and kappa coefficient, are adopted here. We set α = 1 and β = 1, and the other parameters are set to the same values as before. When only these two spectral indices are considered, PC-, PE-, XB- and FS- are equal to 0.301, 1.39, 0.308 and −1.16 × 10^7, respectively. When the spatial information and these two spectral indices are used in the EnIT2FCM* algorithm at the same time, the values of the validity indices improve to 0.313, 1.37, 0.278 and −1.58 × 10^7, respectively. The overall accuracy and kappa coefficient also improve, from 87.92% and 0.843 to 88.02% and 0.844, respectively.
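For readers unfamiliar with the two accuracy measures reported above, the following sketch shows how overall accuracy and the kappa coefficient are conventionally derived from a confusion matrix. The matrix values are hypothetical and are not the results of this paper; the function names and the orientation (rows as reference classes, columns as predicted classes) are assumptions for the example.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / total


def kappa_coefficient(cm):
    """Cohen's kappa: (p_o - p_e) / (1 - p_e).

    p_o is the observed agreement (overall accuracy) and p_e is the agreement
    expected by chance, computed from the row and column totals.
    """
    size = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(size)]
    col_sums = [sum(cm[i][j] for i in range(size)) for j in range(size)]
    p_o = overall_accuracy(cm)
    p_e = sum(row_sums[i] * col_sums[i] for i in range(size)) / (total * total)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_o - p_e) / (1.0 - p_e)


# Hypothetical two-class confusion matrix (rows: reference, columns: predicted).
cm = [[40, 10],
      [5, 45]]
print(overall_accuracy(cm), kappa_coefficient(cm))
```

Because kappa discounts the agreement expected by chance, it is usually lower than the overall accuracy, which is consistent with the pairs of values reported above (for example, 88.02% and 0.844).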

Discussion
In this section, an experiment is implemented to test the EnIT2FCM*. The spatial information definition proposed in this paper combines spatial distance and attribute distance, which makes it more comprehensive than the method proposed by Ngo et al. [10], and it performed very well in the experiment. Some other definitions, e.g., EnFCM [17], FGFCM [18], and NDFCM [19], also consider both spatial information and attribute information, but their attribute information is based on pixel values. When these methods are used for multispectral or hyperspectral remote-sensing classification, the number of bands is relatively large, so the amount of calculation may also be relatively large. Thus, the definition proposed in this paper provides another choice in these situations. In other words, we can use the membership values to calculate the attribute distance if the number of classes is smaller than the number of bands.
As mentioned before, few works consider spectral indices in FCMs for remote-sensing classification. Yang et al. [36] have made a good start, but their work only considered water indices. This paper investigates common vegetation indices, water indices and built-up indices. The experiment shows that EVI, SAVI and AWEIsh perform better than the other indices, and that the combination of SAVI and AWEIsh performs better than any single spectral index or any other combination tested in this experiment.
When spatial information and spectral indices are combined, the classification accuracy still improves, although the additional improvement is small (the overall accuracy rises from 87.92% to 88.02%). Overall, however, the combined effect of spatial information and suitable spectral indices with the modified IT2FCM* improves the classification accuracy significantly.

Conclusions
Classification of remote-sensing data is crucial in the remote-sensing domain, especially in the era of big data. Spectral attributes, spectral indices and spatial information are all important for classification. Usually, spectral attributes and spectral indices are used individually to classify or segment a remote-sensing image, and the spatial information of an image is used to improve the performance. This paper attempts to incorporate all of these factors into the classification algorithm. First, a new distance metric is defined for this purpose, after which the EnIT2FCM* is proposed, which is an unsupervised classification method and an extension of IT2FCM*. Experimental results show that the spatial information definition proposed here is effective and improves the classification performance. Some spectral indices and their combinations improve the performance of this enhanced IT2FCM*, and the combination of SAVI and AWEIsh is the best choice of spectral indices in this method. However, an improper spectral index will degrade the classification accuracy; therefore, the selection of suitable spectral indices is crucial in this method. The combined effect of spatial information and suitable spectral indices with the EnIT2FCM* improves the classification accuracy significantly.
The distance metric has a crucial impact on the FCM. As mentioned in Section 1, the spectral characteristic curves of nearly all geographical features should be bands with certain ranges, and the centers of each class are defined as interval vectors in the IT2FCM* and enhanced IT2FCM*; therefore, measuring the distance from a pixel to these interval vectors is still an open problem. Thus, in future research, we will investigate different distance metrics for these algorithms. In addition, the optimal spectral index may be different for different remote-sensing data. Existing spectral indices are generally derived from Landsat datasets; whether these spectral indices can be applied to other remote-sensing data, such as WorldView-3, needs to be studied in depth, and this problem remains crucial for the EnIT2FCM*. Therefore, determining the validity of spectral indices for different types of remote-sensing data, and possibly defining new spectral indices for them, are among our future research priorities.

Figure 1. Diagram of the technical approach of this paper.

Figure 5. Spectral indices of the research area.

Figure 7. Classification results of EnIT2FCM* by combining single spectral indices.

Figure 8. Classification results of EnIT2FCM* by combining with SAVI and AWEIsh.

Table 1. Some commonly used spectral indices.

Table 2. The process of the EnIT2FCM*.

Table 4. Results of PC-, PE- and accuracy of enhanced IT2FCM* combining different spectral indices.