Exploring Fuzzy Local Spatial Information Algorithms for Remote Sensing Image Classification

Abstract: Fuzzy c-means (FCM) and possibilistic c-means (PCM) are two commonly used fuzzy clustering algorithms for extracting land use land cover (LULC) information from satellite images. However, these algorithms use only spectral or grey-level information of pixels for clustering and ignore their spatial correlation. Different variants of the FCM algorithm have emerged recently that utilize local spatial information in addition to spectral information for clustering. Such algorithms are seen to generate better clustering outputs than the classical spectral-based FCM algorithm. Nonetheless, the scope of integrating spatial contextual information with the conventional PCM algorithm, which has several advantages over the FCM algorithm for supervised classification, has not been explored much. This study proposed integrating local spatial information with the PCM algorithm using simpler but proven approaches from available FCM-based local spatial information algorithms. Three new PCM-based local spatial information algorithms, possibilistic c-means with spatial constraints (PCM-S), possibilistic local information c-means (PLICM), and adaptive possibilistic local information c-means (ADPLICM), were developed corresponding to the available fuzzy c-means with spatial constraints (FCM-S), fuzzy local information c-means (FLICM), and adaptive fuzzy local information c-means (ADFLICM) algorithms. Experiments were conducted to analyze and compare the FCM and PCM classifier variants for supervised LULC classifications in soft (fuzzy) mode. The quantitative assessment of the soft classification results from fuzzy error matrix (FERM) and root mean square error (RMSE) suggested that the new PCM-based local spatial information classifiers produced higher accuracies than the PCM, FCM, or its local spatial variants, in the presence of untrained classes and noise. The promising results from PCM-based local spatial information classifiers suggest that the PCM algorithm, which is known to be naturally robust to noise, when integrated with local spatial information, has the potential to result in more efficient classifiers capable of better handling ambiguities caused by spectral confusions in landscapes.


Introduction
Land use/land cover (LULC) maps are the most useful product derived from remote sensing, as this information is vital for other applications, such as assessment and monitoring of vegetation types, crop and yield estimation, environmental impact assessment, natural resource management and monitoring, and urban planning. Scientists are continually developing advanced classification techniques with improved classification accuracy to generate precise LULC maps. A particular classifier may not work best for all situations, as the characteristics of each image and the circumstances of each study vary greatly. For that reason, the analyst should select a classifier appropriate to the task at hand [1].
Most classifiers used for satellite image analysis are based on the per-pixel approach. With conventional per-pixel classifiers or hard classifiers, the study area is assumed to consist of distinct and discrete LULC classes, which are internally homogenous [2]. The image pixels that represent geographic information on the ground are associated with a single land use or land cover class. However, a single pixel may correspond to more than one land cover in reality, such as a soil pixel with sparse grassland, which can be classified as either "grassland" or "soil" [3]. Such mixed pixels, which show affinity to more than one information class, may also occur at the indistinct class boundaries. Furthermore, ambiguities may arise due to variability within land cover classes and variation in spectral responses recorded by the sensors with corresponding ground situations. The uncertainty or fuzziness in geographic representation due to class mixtures, intermediate conditions, or within-class variability is better represented by soft classification or sub-pixel methods.
Sub-pixel scale analysis assumes an individual pixel's spectral value to be a combination of spectral values of pure pixels from classes, i.e., the pixels are assumed to be mixed pixels. The sub-pixel scale thematic information is usually represented by soft classification outputs where the areal class proportion of each pixel is displayed, and hence a set of output fraction images is generated, one per class [4]. Such methods provide more accurate geographic representation and more truthful results, especially in the case of coarser datasets [5]. Various methods have been explored to derive soft classification outputs, a few of which are the approaches based on fuzzy-set theory, Dempster-Shafer theory, certainty factor, the softening of hard classifier outputs, neural networks, regression modeling, etc. [5,6]. Fuzzy-set-based approaches and spectral mixture analysis are the popular, simpler techniques for dealing with the mixed-pixel problem in remote sensing image classification [5]. Fuzzy set theory [7] introduced the idea of partial membership of data points in multiple clusters characterized by membership functions. The class membership values of a pixel in fuzzy-based classifiers indicate the sub-pixel scale fractional class compositions of that pixel.
One common issue in the standard unmixing or fuzzy-based soft classification methods is the requirement of defining all the classes (endmembers) [8]. As a result, the soft classification outputs generated from these methods depict relative measures of class membership of pixels as opposed to the absolute strength of class membership. Fuzzy c-means (FCM) [9,10], originally an unsupervised fuzzy clustering algorithm, has been successfully implemented and widely used in the field of remote sensing, both for unsupervised and supervised classifications [2,11–16]. However, FCM generates relative class membership of pixels, which means the membership of a pixel (or a data point) in a cluster is dependent on its membership in all remaining clusters [17]. Krishnapuram and Keller [18] observed the performance of FCM and its derivatives to be compromised in noisy environments because of the probabilistic constraint, which forces the membership values in FCM to be relative, and hence presented a possibilistic approach to clustering in the possibilistic c-means (PCM) algorithm. PCM is an iterative clustering algorithm similar to the FCM, but the membership of a data point (pixel) in a cluster is independent of its membership in other clusters. The membership in the PCM algorithm, therefore, represents the "degree of belonging or compatibility" as opposed to the "degree of sharing" in FCM [18]. PCM is naturally robust to noise and outliers because noise pixels have a low degree of compatibility with all clusters. Foody [17] suggested that PCM is more appropriate for supervised LULC classification, as untrained classes are generally encountered during supervised classification. While FCM generates the most accurate class compositions when all the classes are defined, its performance degrades in the presence of untrained classes. PCM is not affected by the presence of untrained classes and thus outperforms FCM in such scenarios [17]. Ibrahim et al. [16], on evaluating the performance of the maximum likelihood classifier (MLC), FCM, and PCM for supervised classification, found PCM to produce the most accurate land cover maps when uncertainty existed in the dataset.
The fuzzy clustering algorithms FCM and PCM do not accommodate the spatial dependence of pixels in the input satellite image for classification [19]. The use of spectral information alone for classification often generates classification outputs that look noisy (salt and pepper effect) [20]. This is caused by the diversity of spectral variability, inherent ground complexities, or inadequate spatial resolution [21]. Medium resolution satellite sensors, such as Landsat Thematic Mapper, cover a large area and produce images in which "the pixel size is smaller than the general extent of landscape objects" [22]. Thus, pixels within images "exhibit a high degree of spatial autocorrelation" [22], which means that pixels that are close together or within the same local neighborhood are likely to belong to the same information class. The suitable use of this spatial information along with spectral information could help to eliminate ambiguities caused by spectral confusions in LULC classes (intra-class spectral variation and inter-class spectral similarity), to recover the missing information, and to correct erroneous pixel classifications [5,23].
Spatial context in remote sensing image analysis can be characterized by texture extraction, filtering methods, mathematical morphology, spatial statistics, contextual techniques (segmentation and object-based image analysis), Markov random fields (MRF), relaxation labeling, the fusion of multisource data/techniques, etc. [5,6,22,24–27]. Numerous models have emerged that fuse multiple data sources or methods to take full advantage of the spatial knowledge in the images and improve classification efficiency. Pixel-level image fusion approaches, generally categorized into multi-scale decomposition and sparse representations, are widely applied fusion approaches in remote sensing [28,29]. Anand et al. [30] used a 3D-DWT (discrete wavelet transform) for feature extraction from hyperspectral data and fed the extracted features to three machine learning (ML) algorithms, Random Forest (RF), Support Vector Machine (SVM), and K-Nearest Neighbor (KNN), for classification. 3D-DWT, a multi-scale decomposition approach, incorporates the association of neighboring pixels for extracting discriminatory features while minimizing the noise in the input hyperspectral image (HSI) data. The three ML algorithms with the added 3D-DWT features performed better than the corresponding traditional versions of the algorithms. Miao and Shi [20] proposed a multistep spectral-spatial method for HSI and multispectral image classification. This classification method fused statistical region merging (SRM) outputs and pixel-based SVM outputs using majority voting to obtain the spectral-spatial classifications.
Spatial contextual information is commonly incorporated in the spectral-based classification as pre-classification/post-classification steps or as additional feature bands during classification without modifying the classifiers. The per-pixel classifiers (SVM, RF) and subpixel classifiers (FCM) have also been integrated with approaches such as MRF and relaxation labeling to simultaneously exploit the spatial and spectral information during classification [31–34]. The concept of local (neighborhood) spatial autocorrelation has emerged recently to integrate spatial information with conventional spectral-based per-pixel or sub-pixel classifiers [6,35,36]. Zhang et al. [35] modified and integrated the k-means algorithm with a neighborhood constrained index to generate a neighborhood-constrained k-means (NC-k-means) algorithm. Deng and Wu [36] proposed a spatially adaptive spectral mixture analysis (SASMA), which involved the integration of spatial and spectral information, unlike the conventional spectral mixture analysis (SMA). All the above spectral classifiers showed improved classification performance when integrated with spatial contextual information.
Deep learning approaches, especially the convolutional neural networks (CNN), have recently gained popularity because of their ability to exploit spatial data through convolution operation for feature extraction, segmentation, object detection, classification, etc. However, the 3D CNN that extracts both spectral and spatial information has high computational complexity. Hence, 2D CNNs are generally widely employed to extract spatial features (e.g., textures) from images [37]. Several studies have combined the benefits of fuzzy classification approaches with that of CNN. Balakrishnan et al. [38] proposed a meticulous fuzzy convolution c-means (MFCCM) algorithm that integrated a fuzzy clustering algorithm with CNN. Although the MFCCM algorithm utilized the significant features extracted from convolutional filtering to generate promising fuzzy clustering results in noisy environments, the algorithm exhibited limitations in terms of time complexity. Wang et al. [39] proposed a semi-supervised approach combining an improved FCM algorithm (IFCM), which effectively utilized labeled data integrated into traditional FCM for clustering, and CNN for fault detection in the sparsity of labeled data. The CNN model in this method, trained using the original data corresponding to the output labeled data from IFCM, was further used for fault detection. Although deep learning models are effective in handling complex classification scenarios by incorporating advanced learning mechanisms, large training data is required to train them to produce generalized outcomes.
The most common and simplest approach to integrating local spatial information into conventional FCM is by modifying the FCM objective function to include spatial information from the neighboring pixels within a local window. The advantage of such a method is that the spatial information can be straightforwardly incorporated into the FCM with minimum changes in the resulting implementation [19]. Integrating spatial contextual information with spectral information in FCM has been seen to improve its robustness and classification accuracy [19,21,40–46]. Pham and Prince [45] proposed an iterative adaptive FCM (AFCM) algorithm that included a multiplier field term to model the brightness variations caused by intensity inhomogeneities in magnetic resonance images. Liew et al. [46] experimented with modifying the FCM objective function with a new adaptive dissimilarity index that takes into account the influence of neighboring pixels on the central pixel within the 3 × 3 window around the pixel. Another study by Pham [19] introduced a spatial penalty term in the objective function of FCM to constrain the behavior of membership functions, keeping the centroid computations of FCM unchanged in this approach. Ahmed et al. [42], in the FCM-S algorithm, introduced a spatial neighborhood term into the conventional FCM algorithm that forced the labeling of the neighboring pixels to influence the labeling of each pixel. One limitation of this algorithm, an increase in computation time, was overcome in FCM_S1 and FCM_S2 proposed by S. Chen and Zhang [43]. FCM_S1 and FCM_S2 used a mean filter and a median filter, respectively, to reduce the computational complexity. However, the output performances of the FCM-S, FCM_S1, and FCM_S2 algorithms depended on an empirically selected parameter, which controlled the robustness to noise and the effectiveness of preserving the image details. Krinidis and Chatzis [44] proposed a fuzzy local information c-means (FLICM) algorithm in which a new fuzzy factor, independent of any empirically selected parameter, was introduced into the FCM objective function. The FLICM algorithm was observed to be weak in identifying class boundary pixels and edge pixels as a result of over-smoothing [21,47]. H. Zhang et al. [21] introduced a fuzzy similarity measure based on a spatial attraction model in their adaptive fuzzy local information c-means algorithm (ADFLICM). The fuzzy similarity measure allowed each pixel during classification to be influenced by its neighboring pixels as well as its own features, thereby ensuring edge retention and noise insensitivity simultaneously. Mishra et al. [48] applied an improved FLICM algorithm as a segmentation method to extract features that were subsequently fed into the local linear wavelet neural network (LLWNN-SCA) model for breast cancer detection and classification. Suman et al. [49] performed a comparative analysis of fuzzy local spatial information algorithms, FCM-S, FLICM, and ADFLICM, with different distance measures and weighting factors.
Although the PCM algorithm has been studied less than FCM, numerous modified versions of PCM have emerged recently to improve the clustering/classification performance of the conventional PCM algorithm to suit different applications [50–56]. In separate studies carried out by Ravindraiah and Chandra Mohan Reddy [55] and Chawla [56], the PCM algorithm was adapted to include spatial contextual information with spectral information for clustering/classification. Ravindraiah and Chandra Mohan Reddy [55] modified the PCM algorithm with an induced spatial constraint to develop a spatial possibilistic c-means clustering algorithm (SPCM) for fundus image classification for diabetic retinopathy (DR) lesions. Chawla [56] developed contextual fuzzy classification algorithms using MRF with PCM as the base classifier. While the contextual PCM classifier with MRF standard regularization and the contextual PCM classifier with MRF discontinuity adaptive (DA) prior outperformed the conventional PCM classifier in terms of accuracy, the contextual PCM with DA prior preserved edges better.
The promising results of the local spatial information algorithms, FCM-S, FLICM, and ADFLICM, and the advantages of PCM over FCM in the presence of untrained classes [17], encouraged us to integrate local spatial information with the standard PCM algorithm and to investigate these algorithms for supervised LULC classification in soft (fuzzy) mode. The possibilistic c-means with spatial constraints (PCM-S), possibilistic local information c-means (PLICM), and adaptive possibilistic local information c-means (ADPLICM) algorithms are proposed, exploiting the methods used for incorporating local spatial information into the FCM-S, FLICM, and ADFLICM algorithms, respectively. The realization of the above-mentioned approaches involves simple modifications to the PCM objective function to integrate spatial data straightforwardly. With appropriate parameter values, the PCM algorithm has been shown to improve on the results of FCM [57]. Thus, we incorporate local spatial information into PCM with a view to generating better classifiers that could be used instead of, or in addition to, the FCM-based local spatial information classifiers, FCM, or PCM classifiers. The study aims not only to incorporate local spatial information into the PCM algorithm but also to evaluate and compare the performance of available FCM-based local spatial information algorithms and the new PCM-based local spatial information algorithms. The contribution of our study lies in (1) understanding the potential of integrating the PCM algorithm with local spatial information for better handling of isolated pixels (noise) and other ambiguities caused by spectral confusions and (2) understanding the relevance of the choice of the fuzzy local spatial information classifiers and their parameter values, in managing outliers and generating accurate classification outputs, based on the number and nature of land cover classes for classification.

Study Area and Datasets
A multispectral satellite image acquired by Landsat-8 on 12 February 2015, with a spatial resolution of 30 m, was used for image classification (Figure 1b). The image covers an area located in Haridwar district of Uttarakhand state, India, extending from latitudes 29°48′48″ N to 29°53′14″ N and longitudes 78°9′56″ E to 78°14′43″ E (Figure 1a). A finer resolution Formosat-2 image of the same area with a spatial resolution of 8 m, acquired on 21 February 2015, was used for generating reference membership fraction images (Figure 1c). The area is diverse in terms of LULC classes present. Six land cover classes, namely, Dense Forest, Eucalyptus, Grassland, Riverine Sand, Wheat, and Water, were selected for our study. A training and test dataset consisting of random pixels from the Landsat-8 image was generated with the help of available field data.

Classification Algorithms
Theoretical explanations of the conventional fuzzy clustering algorithms FCM and PCM, along with the details of the FCM-based local spatial information algorithms (FCM-S, FLICM, and ADFLICM) and the proposed adaptations of these algorithms with PCM, viz. PCM-S, PLICM, and ADPLICM, respectively, are provided in the subsequent sub-sections. The algorithm descriptions are limited to an estimation of class membership values from supervised versions of these algorithms.

Fuzzy c-Means (FCM)
The FCM [9,10] is originally an unsupervised clustering algorithm that finds fuzzy partitions and prototypes by minimizing the objective function in Equation (1).
The FCM objective function is subject to the constraints in Equation (2).
where V is the matrix of cluster centres with its elements denoted by v_i; v_i is the mean vector for cluster i; U is a C × N fuzzy partition matrix representing the membership values µ_ij of pixels per cluster; µ_ij is the membership value of the jth pixel for cluster i; N is the total number of pixels; C is the number of clusters; m is the fuzzy weight, which controls the level of fuzziness and takes values between 1 and infinity (m tending to infinity produces absolutely fuzzy outputs, while m tending to unity produces hard or crisp outputs); D(x_j, v_i) is the squared Euclidean distance (d_ij^2) between the jth pixel value x_j and cluster mean v_i.
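For reference, the standard FCM objective function and constraints from the literature [9,10] can be written in the notation defined above as follows. This is the textbook form and is given here as an assumption about what Equations (1) and (2) denote, with D(x_j, v_i) taken as the squared Euclidean distance.

```latex
% Standard FCM objective (assumed form of Equation (1))
J_{FCM}(U, V) = \sum_{i=1}^{C} \sum_{j=1}^{N} \mu_{ij}^{m} \, D(x_j, v_i),
\qquad D(x_j, v_i) = \lVert x_j - v_i \rVert^{2}

% Probabilistic constraints (assumed form of Equation (2))
\sum_{i=1}^{C} \mu_{ij} = 1 \ \ \forall j, \qquad
0 < \sum_{j=1}^{N} \mu_{ij} < N \ \ \forall i, \qquad
\mu_{ij} \in [0, 1] \ \ \forall i, j
```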
The fuzzy membership, whose value ranges between 0 (low similarity grade) and 1 (high similarity grade), denotes the similarity a data point (pixel) shares with the cluster/class. The optimization of the FCM objective function yields Equation (3) for determining the membership function.
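The membership computation can be sketched in a few lines of NumPy. The sketch assumes the standard FCM update, in which the membership of pixel j in class i is proportional to d_ij^2 raised to the power −1/(m − 1) and normalized over classes; the function name `fcm_membership` and the small constant `eps` are illustrative choices, not part of the study's implementation.

```python
import numpy as np

def fcm_membership(pixels, centroids, m=2.0, eps=1e-12):
    """Single-step FCM memberships for N pixels and C classes.

    pixels: (N, B) array of spectral vectors; centroids: (C, B) class means.
    Returns a (C, N) matrix whose columns sum to 1 (probabilistic constraint).
    """
    # Squared Euclidean distances d_ij^2, shape (C, N)
    diff = centroids[:, None, :] - pixels[None, :, :]
    d2 = np.sum(diff ** 2, axis=2) + eps  # eps (assumed) avoids division by zero
    inv = d2 ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

# Toy example with two classes in a two-band feature space (made-up values)
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
pixels = np.array([[0.0, 0.0], [10.0, 10.0], [5.0, 5.0]])
u = fcm_membership(pixels, centroids)
```

Because the memberships are normalized per pixel, a pixel that lies halfway between two class means receives 0.5 in each class regardless of how far it is from both.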

Possibilistic c-Means (PCM)
The PCM algorithm, developed by Krishnapuram and Keller [18], is a modification of the FCM clustering algorithm. PCM works on the possibilistic theory and thus relaxes the probabilistic constraint of FCM that forces the class memberships of a pixel to be dependent on one another. The objective function of the PCM algorithm is given in Equation (4) subjected to the conditions in Equation (5).
where η_i is the "bandwidth" or "resolution" or "scale" parameter [57] that controls the shape and size of the cluster/class. Its value is selected depending on the distribution of pixels in each cluster. The definition of η_i in Equation (6), with the value of K being generally chosen to be 1, has been found to work well [57].
The membership function (Equation (7)) is derived by minimizing the objective function of PCM. Unlike in FCM, the membership value here signifies the pixel's possibility of belonging to a cluster or its typicality to a cluster. In other words, the membership value in PCM is absolute while the membership value in FCM is relative.
The interpretation of m in PCM is different from that in FCM. Increasing m values in FCM signifies increased sharing of the pixels among all available clusters, while an increasing m value in PCM signifies an increased possibility of all pixels belonging to a given cluster.
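The contrast with FCM can be illustrated with a short sketch. It assumes the forms given by Krishnapuram and Keller [18]: η_i is K times the membership-weighted mean of d_ij^2 within class i, and the membership is 1 / (1 + (d_ij^2 / η_i)^(1/(m − 1))). The function names and the toy distances are illustrative only.

```python
import numpy as np

def pcm_eta(d2, u_init, m=2.0, K=1.0):
    """Bandwidth eta_i per class from initial memberships (for example from FCM).

    d2 and u_init have shape (C, N); returns an array of length C.
    """
    w = u_init ** m
    return K * (w * d2).sum(axis=1) / w.sum(axis=1)

def pcm_membership(d2, eta, m=2.0):
    """Possibilistic memberships; each entry depends only on its own class."""
    return 1.0 / (1.0 + (d2 / eta[:, None]) ** (1.0 / (m - 1.0)))

# Toy squared distances for two classes and three pixels (made-up values);
# the third pixel is far from both classes, as an outlier would be.
d2 = np.array([[1.0, 4.0, 400.0],
               [9.0, 4.0, 400.0]])
eta = np.array([2.0, 2.0])
u = pcm_membership(d2, eta)
```

In this toy case the outlier receives a membership below 0.01 in both classes, whereas a normalized FCM membership would assign it 0.5 in each.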

Fuzzy c-Means with Spatial Constraint (FCM-S)
The FCM-S proposed by Ahmed et al. [42] includes a spatial constraint term in the modified FCM objective function that forces the classification of the pixels to be influenced by the pixel values in the immediate neighborhood. The objective function of FCM-S is defined by Equation (8).
where N_j is the set of neighbor pixels falling into the window around the pixel j; N_R is the cardinality of N_j; a is the parameter that controls the effect from the neighborhood; D(x_r, v_i) is the squared distance (d_ir^2) between the pixel value x_r and the cluster mean v_i, where x_r represents the rth neighbor in the neighboring window of x_j.
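A minimal sketch of the neighborhood term follows. It assumes the membership update reported for FCM-S [42], in which the spectral distance D(x_j, v_i) is augmented by (a / N_R) times the sum of D(x_r, v_i) over the neighbors before normalization. The 3 × 3 window, the exclusion of the centre pixel, the edge-replicated padding, and the function names are assumptions made for illustration.

```python
import numpy as np

def neighbor_mean(d2, exclude_center=True):
    """Mean of d2 over each pixel's 3x3 window; d2 has shape (C, H, W)."""
    C, H, W = d2.shape
    padded = np.pad(d2, ((0, 0), (1, 1), (1, 1)), mode="edge")
    total = np.zeros_like(d2, dtype=float)
    for dy in range(3):
        for dx in range(3):
            if exclude_center and dy == 1 and dx == 1:
                continue
            total += padded[:, dy:dy + H, dx:dx + W]
    n_r = 8 if exclude_center else 9
    return total / n_r

def fcm_s_membership(d2, a=1.0, m=2.0, eps=1e-12):
    """FCM-S memberships from spectral distances plus the neighborhood term."""
    combined = d2 + a * neighbor_mean(d2) + eps
    inv = combined ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=0, keepdims=True)

# Toy 3x3 image: every pixel is close to class 0 except a noisy centre pixel
d2 = np.ones((2, 3, 3))
d2[1] = 50.0
d2[0, 1, 1] = 100.0
u_plain = fcm_s_membership(d2, a=0.0)    # no spatial term
u_spatial = fcm_s_membership(d2, a=4.0)  # neighbors pull the centre to class 0
```

With a = 0 the noisy centre pixel is assigned mostly to class 1; with a = 4 its neighbors outweigh its own spectral distance and it is assigned mostly to class 0, which illustrates the noise-resistance role of a.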

Fuzzy Local Information c-Means (FLICM)
In the FLICM algorithm devised by Krinidis and Chatzis [44], a new fuzzy factor is introduced as a local (spatial and grey level) similarity measure for noise insensitivity and image detail preservation. The algorithm is free from empirically selected parameters. Its objective function is given in Equation (9).
where G_ij is the fuzzy factor (of the jth pixel for the ith class) that uses the spatial distance between the center pixel and the neighboring pixels in the local window to control the influence of the pixels in the neighborhood. G_ij is calculated using Equation (10).
Here, ed_jr is the spatial Euclidean distance between the centre pixel j and the neighboring pixel r; µ_ir is the degree of membership of the rth neighbor pixel (of central pixel j) in cluster i.
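The fuzzy factor can be sketched as below, assuming the form published for FLICM [44]: G_ij sums, over the neighbors r of pixel j (centre excluded), the term (1 − µ_ir)^m D(x_r, v_i) weighted by 1 / (ed_jr + 1). The 3 × 3 window, distances measured in pixel units, zero contribution from neighbors outside the image, and the function name are assumptions for illustration.

```python
import numpy as np

def flicm_factor(u, d2, m=2.0):
    """Fuzzy factor G for a 3x3 window; u and d2 have shape (C, H, W)."""
    C, H, W = u.shape
    term = (1.0 - u) ** m * d2            # (1 - mu_ir)^m * D(x_r, v_i)
    padded = np.pad(term, ((0, 0), (1, 1), (1, 1)), mode="constant")
    G = np.zeros_like(term, dtype=float)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue                   # the centre pixel is excluded
            ed = np.hypot(dy, dx)          # spatial distance ed_jr in pixels
            G += padded[:, 1 + dy:1 + dy + H, 1 + dx:1 + dx + W] / (ed + 1.0)
    return G

# Toy input: uniform memberships and unit distances on a 3x3 image
u = np.full((2, 3, 3), 0.5)
d2 = np.ones((2, 3, 3))
G = flicm_factor(u, d2)
```

The weight 1 / (ed_jr + 1) makes the four edge-adjacent neighbors count more than the four diagonal ones, so no empirically tuned parameter such as a is needed.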

Adaptive Fuzzy Local Information c-Means (ADFLICM)
The ADFLICM clustering algorithm [21] utilizes a fuzzy similarity measure derived from the spatial attraction model. Spatial attraction models are used to describe the spatial correlation between pixels in an image. The spatial attraction (SA) between two pixels j and r for class i is described by Equation (11).
The similarity measure of ADFLICM is defined by Equation (12), which is based on the spatial attraction model given in Equation (11).
The objective function of ADFLICM is described in Equation (13).

PCM-Based Local Spatial Information Classification Algorithms
The objective function of the standard PCM algorithm was modified to develop PCM-S, PLICM, and ADPLICM algorithms, similar to the modification of the FCM objective function in the FCM-S, FLICM, and ADFLICM algorithms, respectively. The objective functions were formulated and minimized in a fashion similar to the standard FCM/PCM classification algorithms. The necessary conditions were obtained for the objective function of each algorithm to reach a local minimum with respect to µ_ij.
(1) Possibilistic c-means with spatial constraint (PCM-S)
The objective function of the PCM algorithm was modified with the spatial constraint term from the FCM-S algorithm to develop the PCM-S algorithm. The PCM-S objective function is given in Equation (14).
To obtain the membership equation, the objective function had to be minimized with respect to U. As the rows and columns of U are independent of one another, minimizing the objective function with respect to U is equivalent to minimizing the individual objective function in Equation (15) with respect to µ_ij, provided that the resulting membership value lies in the interval [0,1] [18].
Differentiating Equation (15) with respect to µ_ij and setting it to zero yielded the equation for the membership function given in Equation (16).
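The differentiation step can be written out as follows. This is a sketch under the assumption that the PCM-S objective adds the FCM-S neighborhood term to the PCM objective unchanged; the grouping symbol \tilde{D}_{ij} is introduced here for brevity and is not the source's notation.

```latex
% Assumed per-pixel objective (cf. Equation (15)):
J_{ij} = \mu_{ij}^{m}\,\tilde{D}_{ij} + \eta_i\,(1-\mu_{ij})^{m},
\qquad
\tilde{D}_{ij} = D(x_j, v_i) + \frac{a}{N_R}\sum_{r \in N_j} D(x_r, v_i)

% Stationarity condition:
\frac{\partial J_{ij}}{\partial \mu_{ij}}
  = m\,\mu_{ij}^{m-1}\,\tilde{D}_{ij} - m\,\eta_i\,(1-\mu_{ij})^{m-1} = 0

% Solving for the membership:
\left(\frac{1-\mu_{ij}}{\mu_{ij}}\right)^{m-1} = \frac{\tilde{D}_{ij}}{\eta_i}
\quad\Longrightarrow\quad
\mu_{ij} = \frac{1}{1 + \left(\tilde{D}_{ij}/\eta_i\right)^{1/(m-1)}}
```

Under this assumption the result has the same shape as the PCM membership, with the spectral distance replaced by the neighborhood-augmented distance, and it lies in [0,1] for any non-negative distance.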
(2) Possibilistic local information c-means (PLICM)
The PLICM algorithm was formulated by modifying the PCM algorithm with the fuzzy factor from the FLICM algorithm. The objective function of the PLICM algorithm is described in Equation (17).
The equation for the fuzzy factor G_ij is the same as in the FLICM algorithm, which is described by Equation (10). The membership function (Equation (18)) below is obtained for J_PLICM(U, V) at its local minimum with respect to µ_ij.
(3) Adaptive possibilistic local information c-means (ADPLICM)
An ADPLICM algorithm was formulated using the similarity measure with the spatial attraction model as defined in the case of the ADFLICM algorithm. The spatial attraction (SA_jr) and similarity measure (S_jr) equations for ADPLICM are the same as those of the ADFLICM algorithm, which correspond to Equations (11) and (12). The objective function for the ADPLICM algorithm is described in Equation (19).
The membership function of the ADPLICM algorithm obtained is given in Equation (20).

Methodology
The methodology adopted in this study is shown in Figure 2. The methodology steps involved pre-processing of satellite images, implementation of classification algorithms, optimization of algorithm parameters, and evaluation of algorithm performances by experimental analysis. All classification algorithms and accuracy assessment methods were implemented using Python 3.7 (libraries used: Rasterio, NumPy, Matplotlib, math).

Figure 2. Steps for the adopted methodology in the study.
The Landsat-8 image used for classification and the Formosat-2 image used for the creation of reference membership fraction images were initially pre-processed. This included geo-referencing of Formosat-2 image and resampling to a 10 m spatial resolution using the Nearest Neighbor method so that the pixel sizes in the Formosat-2 image and the Landsat-8 image were in the ratio of 1:3. Image-to-image registration was subsequently performed using the Formosat-2 image chosen as the master image. The images were then cropped to the study area for further analysis.
The algorithm parameters, namely the fuzzy factor m, the window size, and a (in the case of FCM-S and PCM-S), were initially optimized so that the classifiers could exploit their maximum potential. Owing to the disparity in the choice of the optimal value of m in the literature [9,28,29], we followed an experimental strategy to estimate the best possible value of m in the interval [1.1, 3], for which the highest classification accuracy was obtained. The window size and the parameter a in FCM-S and PCM-S were also optimized following a trial-and-error approach. Estimating the optimal value of a in the FCM-S algorithm without prior knowledge of noise was found to be difficult [44]. Larger values of a generated more noise-resistant outputs, while smaller values of a generated outputs in which image details were better preserved. We therefore followed an empirical approach to find the optimal value of a, as in other studies of the FCM-S algorithm [21,42-44]. The value of a was chosen by repeated testing in the interval [0.2, 8] and fixing the value at which the best accuracies according to FERM and RMSE were obtained.
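The trial-and-error search described above can be sketched as a plain grid sweep. This is only an illustration: the function `sweep_parameter`, the 0.1 step size, and the stand-in scoring function `toy_accuracy` are assumptions introduced here and do not come from the study, where the score would be the accuracy obtained after classifying with the candidate value.

```python
import numpy as np

def sweep_parameter(candidates, score_fn):
    """Return (best_value, best_score), keeping the candidate with the highest score."""
    best_value, best_score = None, -np.inf
    for value in candidates:
        score = score_fn(value)
        if score > best_score:
            best_value, best_score = value, score
    return best_value, best_score

def toy_accuracy(m):
    """Hypothetical stand-in for 'classify with fuzzy factor m, then measure accuracy'."""
    return 1.0 - (m - 1.8) ** 2

# Candidate values of m in [1.1, 3]; the 0.1 step is an assumption.
m_grid = np.round(np.arange(1.1, 3.0 + 1e-9, 0.1), 1)
best_m, best_acc = sweep_parameter(m_grid, toy_accuracy)
print(best_m, best_acc)
```

The same loop applies to the parameter a over [0.2, 8]; when RMSE is the criterion, the score function can return the negative RMSE so that a higher score still means a better result.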
The flowcharts for the execution of the PCM-S, PLICM, and ADPLICM algorithms are presented in Figure 3. To initialize the PLICM and ADPLICM algorithms, the PCM algorithm was used. The supervised versions of the FCM, PCM, FCM-S, FLICM, and ADFLICM algorithms were also implemented for comparative analysis. The modification of unsupervised to supervised classification involved obtaining class centroids from input training data and estimating class memberships in a single step [17].
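A minimal sketch of the single-step supervised mode is given below, using the standard FCM membership and PCM typicality expressions rather than the exact implementation of this study. The function names, the toy training data, and the scale parameter `eta` (supplied by hand here) are assumptions made for illustration.

```python
import numpy as np

def class_centroids(train_pixels, train_labels, n_classes):
    """Mean spectral vector of the training pixels of each class."""
    return np.stack([train_pixels[train_labels == c].mean(axis=0)
                     for c in range(n_classes)])

def squared_distances(pixels, centroids):
    """Squared Euclidean distances, shape (n_pixels, n_classes)."""
    diff = pixels[:, None, :] - centroids[None, :, :]
    # Clip to avoid division by zero when a pixel coincides with a centroid.
    return np.maximum((diff ** 2).sum(axis=2), 1e-12)

def fcm_memberships(pixels, centroids, m=2.0):
    """Standard FCM memberships: relative to all classes, rows sum to one."""
    inv = squared_distances(pixels, centroids) ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def pcm_typicalities(pixels, centroids, eta, m=2.0):
    """Standard PCM typicalities: each class is evaluated on its own."""
    d2 = squared_distances(pixels, centroids)
    return 1.0 / (1.0 + (d2 / eta) ** (1.0 / (m - 1.0)))

# Toy two-band data (assumed values): two trained classes.
train_pixels = np.array([[10.0, 10.0], [12.0, 10.0], [50.0, 50.0], [52.0, 50.0]])
train_labels = np.array([0, 0, 1, 1])
centroids = class_centroids(train_pixels, train_labels, n_classes=2)

# One pixel at the first centroid and one pixel far from both classes.
pixels = np.array([[11.0, 10.0], [200.0, 200.0]])
u = fcm_memberships(pixels, centroids, m=2.0)
t = pcm_typicalities(pixels, centroids, eta=np.array([4.0, 4.0]), m=2.0)
print(u)
print(t)
```

Because the centroids come directly from the training data, no iterative update of centroids is needed and the memberships follow in one pass.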

The performances of the local spatial information algorithms were examined for supervised classification in the soft (fuzzy) mode. The input image was separately classified using the FCM, PCM, FCM-S, FLICM, ADFLICM, PCM-S, PLICM, and ADPLICM classifiers for the six identified land cover classes. The accuracy matrices of the classifiers were estimated and compared to analyze the performance of the PCM-based local spatial information classifiers against the FCM-based local spatial information classifiers. The soft classification outputs from FCM and PCM were also included in the accuracy assessment to better assess the effect of including local spatial data in these standard classifiers. Furthermore, experiments were performed to evaluate all eight classifiers' outputs in the presence of untrained classes and in the presence of isolated noisy pixels (pixels that are different from their neighboring pixels). The details of these experiments, along with the results, are explained in the next section.
As soft classification outputs represent the proportion of two or more classes in a pixel, conventional accuracy assessment methods, such as the confusion matrix [58] or kappa coefficient, could not be applied. Therefore, the overall accuracy of the fuzzy error matrix (FERM) [59] and RMSE (root mean square error) were used for assessing the accuracy of soft classified outputs generated by each algorithm with reference to the soft reference data. FERM has a layout similar to the conventional error matrix with the exception that it can have non-negative real numbers, which indicate the class proportions in the reference image and classified image, instead of non-negative integer values. The test dataset was carefully prepared to include pure pixels from homogenous regions of each class, isolated pixels, and pixels at the boundaries or edges of the classes.
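The two measures can be sketched as follows. The FERM construction here assumes the commonly used MIN operator, with overall accuracy taken as the diagonal sum divided by the total reference membership; the source does not state which operator was used, so this is an assumption, as are the function names and the toy fraction values.

```python
import numpy as np

def fuzzy_error_matrix(classified, reference):
    """FERM under the MIN operator (assumed): entry (i, j) sums
    min(classified class i, reference class j) over all test pixels."""
    # Inputs have shape (n_pixels, n_classes) and hold class proportions.
    return np.minimum(classified[:, :, None], reference[:, None, :]).sum(axis=0)

def ferm_overall_accuracy(classified, reference):
    """Diagonal sum of the FERM divided by the total reference membership."""
    return float(np.trace(fuzzy_error_matrix(classified, reference)) / reference.sum())

def global_rmse(classified, reference):
    """RMSE over all pixels and classes."""
    return float(np.sqrt(np.mean((classified - reference) ** 2)))

def classwise_rmse(classified, reference):
    """One RMSE value per class (column)."""
    return np.sqrt(np.mean((classified - reference) ** 2, axis=0))

# Toy data (assumed): two test pixels, two classes.
reference = np.array([[1.0, 0.0], [0.5, 0.5]])
classified = np.array([[0.8, 0.2], [0.5, 0.5]])
print(fuzzy_error_matrix(classified, reference))
print(ferm_overall_accuracy(classified, reference))
print(global_rmse(classified, reference), classwise_rmse(classified, reference))
```

Unlike a conventional error matrix, the entries are real-valued sums of class proportions, so a mixed pixel contributes partially to several cells.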

Results
The performances of the PCM-S, PLICM, and ADPLICM classifiers were examined and compared with those of the FCM-S, FLICM, ADFLICM, and standard FCM/PCM classifiers through four experiments. The scenarios executed were (1) supervised classification with all the identified classes, (2) supervised classification with untrained classes, (3) supervised classification for single class extraction, and (4) supervised classification in the presence of noise. The optimal value of the fuzzy factor m for each classifier was observed to vary with the number of classes selected for classification. Hence, different m values were used for the same classifier in each of the scenarios executed.

Experiment 1: Supervised Classification with All the Identified Classes
The sub-pixel land cover fraction outputs derived for the six classes (Dense Forest, Eucalyptus, Grassland, Riverine Sand, Wheat, and Water) by the eight fuzzy classification algorithms are given in Figures 4-9, respectively. The parameter values of all the algorithms were initially optimized by repeated testing and after careful examination of their effect on all the land cover classes (Table 1).
The FERM overall accuracy was used to quantitatively analyze the performance of the algorithms. The overall accuracy was calculated for each algorithm (1) while taking all the random test samples for accuracy assessment, (2) while taking only the homogenous pixels for testing, (3) while taking isolated pixels as test samples, and (4) while taking the edge and boundary pixels as test samples. The overall accuracies for these four cases are denoted by OA, OA1, OA2, and OA3, respectively (Table 2 and Figure 10). In general, the overall accuracies for all the algorithms were higher (OA1) when the test data consisted of pure pixels from homogenous regions and lower (OA3) when the test data consisted of mixed pixels, whose exact ground proportion was not known. While including more pure test pixels from homogenous regions could improve the general overall accuracy, the purpose of the study was to compare the algorithm performance for different cases. The generally low accuracies of the algorithms therefore do not have a considerable impact on the conclusions and inferences of this comparative study.
The FCM classifier variants were seen to produce higher accuracies in this scenario. The FCM-S, FLICM, and ADFLICM showed greater overall accuracies when all the random test sample pixels were used for accuracy assessment. Although FLICM and FCM-S were effective in removing isolated pixels and their classification performances were slightly higher than that of conventional FCM in homogenous regions (Figures 10 and 11), they produced smoother outputs, which resulted in the loss of image details, such as edges and boundaries (Figure 12). As inferred by H. Zhang et al. [21] in their study, the ADFLICM algorithm showed acceptable performance compared to FCM-S and FLICM in terms of handling isolated pixels while simultaneously retaining the image details (Figure 12). While the overall accuracies of PCM and the PCM-based local spatial information classifiers were lower, the PCM classifier was seen to retain the edge and boundary pixels better than the FCM variants (OA3 in Figure 10).
The comparison of global RMSE values estimated for the fuzzy local spatial information classifiers also implied the superiority of the FCM-based local spatial information classifiers over the PCM-based local spatial information classifiers (Table 3 and Figure 13). The PCM-S, PLICM, and ADPLICM generated higher RMSE values, while the FCM-S, FLICM, and ADFLICM generated lower RMSE values.
Figure 13. Global RMSE values for fuzzy local spatial information classifier outputs.

Experiment 2: Supervised Classification with Untrained Classes
To understand the performance of the fuzzy local spatial information algorithms in the presence of untrained classes, training pixels from a few information classes were excluded from the training data. Three classes, 'Riverine Sand', 'Water', and 'Wheat', were defined, and the training pixels for the remaining classes were removed from the training data before classification. This meant that pixels in the study area belonging to the omitted classes should ideally be treated as noisy pixels by the classifiers. The output fraction images for the three classes were generated separately by each of the eight classifiers (Figures 14-16). The values chosen for m were 1.5, 1.2, 1.5, and 1.4 for the PCM, PCM-S, PLICM, and ADPLICM algorithms, respectively. The remaining parameter values were the same as those given in Table 1.
The global and class-wise RMSE values were estimated for the outputs from each classifier (Table 4 and Figure 17). The PCM-based local spatial information classifiers, PCM-S, PLICM, and ADPLICM, exhibited lower RMSE values, which meant that there was less disparity between the output fraction images generated from these classifiers and the corresponding reference fraction images. Although the PLICM classifier was seen to be slightly advantageous over the other two algorithms in terms of global RMSE value, the class-wise RMSE for the Water class and visual interpretation suggested a significant loss in membership values for Water pixels with the PLICM classifier. To better analyze the performance of the PCM-based local spatial information classifiers, the RMSE calculation was repeated with test pixels from homogenous regions, noisy test pixels, and test pixels from the borders and edges of the LULC classes (Figure 18). The RMSE values obtained in this case affirmed that, among the PCM-based local spatial information classifiers, the loss of image details was greatest in the PLICM classifier and smallest in the ADPLICM classifier.
The RMSE values with noisy test pixels were, in general, lower for the PCM-based local spatial information classifiers, with PLICM producing the lowest value. The ADPLICM classifier was seen to perform better in retaining the image details among the PCM-based local spatial information classifiers.

Experiment 3: Supervised Classification for Single Class Extraction
Extracting a single land cover class is often required in remote sensing applications. The FCM algorithm and its variants cannot be used to extract a single class from satellite images because of the probabilistic membership constraint in Equation (2), which forces the membership value of every pixel in the single output fraction image to be one. In contrast, the PCM algorithm has proved viable for one-class classification [60]. For the extraction of a single LULC class from the input dataset, PCM, PCM-S, PLICM, and ADPLICM were used in this study. The training data for the 'Wheat' class alone was given as input to the classifiers, and output fraction images for the standard PCM and the three PCM-based local spatial information classification algorithms were generated (Figure 19). The RMSE values obtained for the outputs of the classifiers are given in Table 5. Based on the RMSE values, PLICM/ADPLICM was observed to be the better performing classifier in extracting the 'Wheat' class.
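The effect of the constraint can be written out explicitly. The notation below is assumed for illustration, since Equation (2) is not reproduced in this section: \mu_{ij} is the FCM membership of pixel j in class i, c is the number of classes, t_{1j} is the standard PCM typicality, d_{1j} is the distance of pixel j to the single class centroid, and \eta_{1} is the PCM scale parameter of that class.

```latex
\sum_{i=1}^{c} \mu_{ij} = 1 \quad \forall j
\;\;\Longrightarrow\;\;
\mu_{1j} = 1 \quad \text{when } c = 1,
\qquad \text{whereas} \qquad
t_{1j} = \frac{1}{1 + \left( d_{1j}^{2} / \eta_{1} \right)^{1/(m-1)}} \in (0, 1].
```

With a single class, the FCM output therefore carries no information, while the PCM typicality still decreases with the distance of the pixel from the class centroid.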

Experiment 4: Supervised Classification in the Presence of Noise
The local spatial information algorithms were investigated with a noisy image to evaluate their noise tolerance. The Formosat-2 image was corrupted with noisy or isolated pixels whose DN values differed substantially from those of the surrounding pixels. This was accomplished by setting the DN values of random pixels in the image to 255 in all bands, so that the random noise pixels appeared as 'white dots' in the image (Figure 20). The noisy image was then separately classified using the FCM, FCM-S, PCM, and PCM-S classifiers with the training data of three classes ('Riverine Sand', 'Water', and 'Wheat'). FCM-S and PCM-S were chosen among the local spatial information classifiers since the effect of the neighboring pixels on classification could be controlled using the parameter a, unlike in the other local spatial information classifiers. The value of the parameter a was set to 2 in FCM-S and 0.2 in PCM-S. An m value of 1.6 was used in all four classifiers.
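The corruption step described above can be sketched as follows. The function name `add_isolated_noise`, the number of noisy pixels, the random seed, and the toy image size are assumptions, since the source only states that random pixels were set to 255 in all bands.

```python
import numpy as np

def add_isolated_noise(image, n_noisy, value=255, seed=0):
    """Set n_noisy randomly chosen pixels to `value` in every band.

    image has shape (rows, cols, bands); the input array is left unchanged.
    """
    rows, cols, _ = image.shape
    rng = np.random.default_rng(seed)
    # Draw distinct pixel positions so that exactly n_noisy pixels are altered.
    flat_idx = rng.choice(rows * cols, size=n_noisy, replace=False)
    r, c = np.unravel_index(flat_idx, (rows, cols))
    noisy = image.copy()
    noisy[r, c, :] = value
    return noisy, (r, c)

# Toy 4-band image (assumed size and content).
image = np.zeros((20, 20, 4), dtype=np.uint8)
noisy, (r, c) = add_isolated_noise(image, n_noisy=15)
print(int((noisy == 255).all(axis=2).sum()))
```

Returning the altered positions makes it possible to restrict the later accuracy assessment to the noisy pixels alone.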
The noisy pixels in the input image appeared as isolated pixels in the classification outputs, predominantly in the class proportion images generated for 'Riverine Sand'. The membership fraction images of the class 'Riverine Sand' generated by the FCM, FCM-S, PCM, and PCM-S classifiers are shown in Figure 21. The isolated pixels were most evident in the FCM output, owing to their high membership values in the class 'Riverine Sand' relative to the other defined classes. The noise was less noticeable in the FCM-S classifier output than in the FCM output, very minimal in the PCM output, and negligible in the PCM-S output.

Discussion
While the choice of reference data and accuracy assessment methods strongly influences the analysis and conclusions drawn from a study, a few specific observations on the results and the accuracy matrices obtained in the scenarios executed in our study are described below. The overall accuracy of FERM and the visual interpretation demonstrated that the three FCM-based local spatial information classifiers (FCM-S, FLICM, and ADFLICM) produced the most accurate results, followed by the FCM classifier, for classifications with all six LULC classes. In contrast, the output quality of the FCM spatial variants degraded significantly in the presence of untrained classes, whereas the land cover compositions obtained from the PCM-based local spatial information classifiers were in close agreement with the reference image when a few classes were excluded from training. The three PCM-based local spatial information classifiers (PCM-S, PLICM, and ADPLICM) produced similar RMSE values, which were lower than those of the standard PCM classifier. In other words, the effectiveness of the PCM algorithm in the presence of untrained classes and in handling isolated pixels (noise) was further improved by integrating local spatial information. While PLICM and ADPLICM were slightly advantageous in creating accurate land cover class compositions, the empirically set parameter a in PCM-S allowed control of the neighborhood effect, depending on the noise intensity in the input image. Among the PCM-based local spatial information classifiers, ADPLICM preserved the most image detail, as ADFLICM did among the FCM-based local spatial information classifiers.
It was not possible to make a general comparison of the performance of the FCM-based local spatial information classifiers with that of the PCM-based local spatial information classifiers, since the fundamental behaviors of their corresponding base classifiers were different. The fraction outputs of the FCM classifier illustrated how well every image pixel was accommodated among the existing classes, while the fraction image of each class in the PCM classifier reflected how well every image pixel belonged to that class alone. Because of this, the FCM variants performed well when all the classes were defined, but their performance degraded when a few classes were excluded from training. As Foody [44] concluded in his study, the PCM classifier is sometimes more appropriate than the FCM classifier, although the PCM produces less accurate land cover estimates when all classes are defined. Similar results were observed in our study, in which the PCM and the PCM-based local spatial information classifiers produced lower accuracies than the FCM classifier variants when all classes were defined (Experiment 1). However, the performance of the PCM classifier variants improved in the presence of untrained classes, as the membership values in these classifiers are calculated using an absolute measure instead of relative measures [17]. The typicality-based membership of the PCM makes it naturally robust to noise, and hence it results in a more accurate representation of reality in ambiguous environments, regardless of the number of training classes available [16,18]. Further, the typicality of PCM membership values was seen to be enhanced when integrated with local spatial information. In other words, the PCM-based local spatial information classifiers were observed to be even more advantageous in cases where the conventional PCM classifier generally performed better than the FCM variants.
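A small worked example illustrates the difference between the relative and absolute measures, using the standard FCM and PCM expressions with assumed values that are not taken from the study: a pixel of an untrained class lies at squared distances d_{1}^{2} = 100 and d_{2}^{2} = 400 from two trained class centroids, with m = 2 and a common PCM scale parameter \eta = 25.

```latex
\begin{aligned}
\text{FCM:}\quad \mu_{1} &= \frac{1/d_{1}^{2}}{1/d_{1}^{2} + 1/d_{2}^{2}}
                          = \frac{1/100}{1/100 + 1/400} = 0.8,
  & \mu_{2} &= 0.2, \\
\text{PCM:}\quad t_{1} &= \frac{1}{1 + d_{1}^{2}/\eta}
                        = \frac{1}{1 + 100/25} = 0.2,
  & t_{2} &= \frac{1}{1 + 400/25} \approx 0.059.
\end{aligned}
```

The FCM memberships must sum to one, so the pixel receives a high membership in the nearer trained class even though it is far from both, whereas the PCM typicalities remain low for both classes.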
To summarize, the FCM-based local spatial information classifiers and PCM-based local spatial information classifiers worked well in situations where their respective base classifier performed better, disregarding the obvious smoothing that resulted from integration with spatial contextual information.
The fuzzy local spatial information classifiers investigated in this study were observed to have clear advantages over the conventional fuzzy classifiers in handling the noisy and isolated pixels that occurred due to spectral confusions. Nevertheless, there was a loss of image details in the outputs of these local spatial information classifiers due to smoothing in classes with a smaller spatial extent, such as the 'Water' class, and also at the edges and boundaries of classes. The new PCM-based local spatial information classifiers showed the greatest loss of image details and the poorest performance among all the classifiers during supervised classifications with training pixels from all the available land cover classes. However, these classifiers were found to be more efficient than the other fuzzy classifiers at handling noise and producing accurate fuzzy classification outputs during supervised classifications in the presence of untrained classes. Overall, the new PCM-based local spatial information classifiers might be the right choice over the existing PCM, FCM, or FCM-based local spatial variants in the presence of untrained classes and for single class extraction.
While this study elucidates the performance of PCM-based local spatial information algorithms and a few of their advantages, some major limitations of the study need to be acknowledged. First, the PCM performance greatly depends on its initialization. FCM, which is said to provide sensible initialization and scale estimates in less contaminated data [57], was used for initializing the PCM algorithm, and hence PCM and its variants were seen to produce different class proportion images on inclusion or exclusion of training classes for classification. Second, only limited approaches have been explored for incorporating local spatial information in the PCM algorithm, namely measures that use the spatial Euclidean distance between the center pixel and neighboring pixels. This study could be extended to exploit measures such as the correlation among the pixels to control the effect of neighboring pixels on the center pixel. Third, a small study area in India was chosen for investigating the new algorithms, as the area is well known and hence suitable for making a relative analysis of results from the existing and new algorithms. The evaluation could be extended to other areas with a wider set of LULC classes. Fourth, there is no standard process for accuracy assessment of soft classification. Methods for the creation of soft reference data and for the accuracy assessment of soft classified outputs are still open research areas. We mainly used RMSE for obtaining a comparative judgment and have not performed any statistical tests on the data or results obtained. Performing appropriate statistical significance tests correctly and drawing inferences cautiously from such tests could help establish the validity of the results [61].

Conclusions
Relevant information processing depends on good representation methods for information and efficient processing algorithms. Over decades, classifiers have been developed to improve the extraction of useful information from remote sensing imagery. In addition to spectral information, ancillary data, such as spatial-contextual information, has been seen to greatly enhance classifier performance. The primary aim of this study was to investigate the effect of incorporating local spatial information into the PCM algorithm using simpler methods and to evaluate the benefits of the new PCM-based local spatial information classifiers over the existing FCM-based local spatial information classifiers. The PCM-S, PLICM, and ADPLICM algorithms were developed for classification, and their outputs were compared with those of the existing FCM-S, FLICM, and ADFLICM classifiers. To substantiate the advantages of integrating local spatial information for classification, the conventional FCM and PCM classifiers were also included in the assessment. Trials were performed to evaluate and compare the outputs of all eight classifiers with all the classes, in the presence of untrained classes, and upon the introduction of isolated pixels. While it was not possible to identify a classifier that consistently gave better classification results, the PCM-based local spatial information classifiers were observed to produce higher accuracies in the presence of untrained classes and in the presence of noise. The PLICM and ADPLICM classifiers produced lower global RMSE values and generated more accurate class proportions for classifications in the presence of untrained classes and for extracting a single land cover class, with ADPLICM being better at image detail preservation and PLICM being efficient at handling isolated pixels. At the same time, the adjustable parameter a in the PCM-S classifier allowed control of the classifier's tolerance to input noise.
In general, the performance of the PCM classifier was improved by integrating it with spatial information, as the overall RMSE value was reduced by more than 0.1. Likewise, the RMSE values obtained for the three PCM-based local spatial information classifiers in the presence of untrained classes were around 0.20, whereas the RMSE values obtained for the FCM-based local spatial information classifiers were in the range of 0.35-0.37. Therefore, the PCM classifier integrated with local spatial information has considerable potential to generate land cover compositions that are closer to reality, with better handling of outliers. In other words, PCM spatial information classifiers could produce better classification results than conventional PCM, FCM, or FCM variants, especially in the absence of an exhaustive set of information classes for training.