Land Cover Classification from Multispectral Data Using Computational Intelligence Tools: A Comparative Study

This article discusses how computational intelligence techniques are applied to fuse spectral images into a higher-level image of land cover distribution for remote sensing, specifically for satellite image classification. We compare a fuzzy-inference method with two other computational intelligence methods, decision trees and neural networks, using a case study of land cover classification from satellite images. An unsupervised approach based on k-means clustering is also included in the comparison. The fuzzy-inference method includes training the classifier with a fuzzy-fusion technique and then performing land cover classification using reinforcement aggregation operators. To assess the robustness of the four methods, a comparative study including three years of land cover maps for the district of Mandimba, Niassa province, Mozambique, was undertaken. Our results show that the fuzzy-fusion method performs similarly to decision trees, achieving reliable classifications; neural networks suffer from overfitting; and k-means clustering constitutes a promising technique to identify land cover types in unknown areas.


Introduction
The main objective of this study is to discuss the suitability of different computational intelligence methods for studying land cover spatiotemporal modifications, mainly for improving land usage and management. This paper is based on a preliminary conference paper [1] where we presented a novel fuzzy image fusion technique and compared it with two other computational intelligence methods, decision trees and neural networks, for fusing images and performing classification of terrains as waterbody, river bank, bare area, cropland, grassland, shrubland and forest. In this article, we extend the previous work by reproducing a spatiotemporal case study with three years (1989, 2002, 2005) run by Temudo et al. [2] where the changes of land cover usage in the post-war period (after 1992) in the district of Mandimba, Niassa province, Mozambique, were studied (http://earthexplorer.usgs.gov). In our article, we also include an unsupervised approach for enriching the comparative study. The aim of this extension is to strengthen the claims about the accuracy of the fuzzy-fusion approach and to demonstrate its suitability for spatiotemporal image fusion.
Nowadays, with the free availability in web repositories of multispectral satellite images from Earth observation missions, we can efficiently classify terrain types and, therefore, contribute to studies and improved analysis of the Earth environment. These studies fall into the domain of remote sensing analysis and cover topics such as deforestation, degradation of coastal areas, wildfires and shifting cultivation, among others [3]. The term remote sensing is understood as a technique for identifying, classifying and determining an object's properties through the analysis of data acquired remotely, without physical contact with the object itself [4].
There are several works in the literature about applying Computational Intelligence (CI) techniques to satellite images for feature extraction and classification, with the aim of improving land cover analysis. For example, CI techniques such as fuzzy logic [5][6][7], Decision Trees (DT) [8][9][10] and Artificial Neural Networks (ANN) [11][12][13][14] are already applied in many classification problems and in particular in some cases of remote sensing [15][16][17].
The discussed fuzzy-fusion inference method, instead of just using a basic fuzzy inference model, is improved with reinforcement aggregation operators to perform the inference reasoning [1,18]. Reinforcement aggregation operators have been applied in decision making methods and data fusion [19][20][21][22], but here, they are used in the inference scheme to ensure positive or negative reinforcement in the classification. The fuzzy-fusion inference method includes three steps: (i) fuzzification of the spectral information (bands); (ii) creation of the rule set for the land cover classes; (iii) evaluation of the rules using aggregation operators. To create the fuzzy membership functions (Step i), we generated histograms from the training data and then built the membership functions by fitting Gaussian functions to the histogram's clusters (clustered with Otsu's thresholding method [23]) using a hybrid algorithm that combines the Levenberg-Marquardt optimization method [24] and the classic mean and standard deviation approach [16]. For the classification process (Step ii), seven rules were defined, one for each class to be classified, having seven inputs each, corresponding to the spectral bands' membership functions. For the inference scheme (Step iii), we followed the Takagi-Sugeno model of fuzzy inputs, crisp output [25] and tested four types of aggregation operators: average; minimum; and two reinforcement operators: Fixed Identity Monotonic Identity Commutative Aggregation (FIMICA); Fuzzy-Fusion (FF)-Uninorm [21,26,27].
This paper is organized as follows. Section 2 provides an overview of techniques used in remote sensing, with special consideration of approaches based on the techniques of computational intelligence. In Section 3, the case study and the proposed classification method are presented. In Section 4, a comparative study of the results obtained with our approach, DT, ANN and unsupervised k-means clustering is given. Finally, in Section 5, we provide some conclusions and future work.

Related Work in Remote Sensing Analysis
Land cover classification from satellite images constitutes a typical problem related to remote sensing. It can be performed in a supervised way (having a preclassified training set) or an unsupervised way (where no additional information regarding the content of the image is provided). This means that the unsupervised algorithm extracts class labels for each image region (or individual pixel), being equivalent to a clustering task. Both classic statistical approaches and others based on Computational Intelligence (CI) techniques have been successfully applied for this clustering task. The classic statistical approach involves algorithms such as k-means [28], c-means [29] or the expectation-maximization method [30]. On the other hand, the approaches based on CI frequently employ nature-inspired heuristics like genetic algorithms [31] or particle swarm optimization [32]. Supervised CI methods represent an alternative solution to the aforementioned clustering techniques, where the algorithms are supplemented with additional knowledge related to the assignment of pixels/regions to different classes. Among the most common approaches of this type are DT, ANN and fuzzy inference systems. As the paper studies a supervised variant of a remote sensing method and employs the aforementioned techniques for comparison, they will be covered below in more detail.
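To make the unsupervised case concrete, the following is a minimal sketch of k-means applied to pixel vectors, written with the Python standard library only. The function name, the toy two-band pixels and the evenly spaced centroid initialization are assumptions made for illustration; they are not taken from the study.

```python
def kmeans(pixels, k, iterations=20):
    """Cluster pixel vectors (lists of band values) into k groups."""
    # Illustrative initialisation: evenly spaced pixels become the first centroids.
    centroids = [list(pixels[i * len(pixels) // k]) for i in range(k)]
    labels = [0] * len(pixels)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centroid.
        for i, p in enumerate(pixels):
            dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
            labels[i] = dists.index(min(dists))
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(pixels, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Toy two-band pixels: three dark ones followed by three bright ones.
pixels = [[10, 12], [11, 9], [9, 11], [200, 210], [205, 198], [198, 202]]
labels, centroids = kmeans(pixels, k=2)
```

The cluster indices returned carry no land cover meaning by themselves; an analyst still has to map each cluster to a class, which is the main practical difference from the supervised methods discussed next.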
Decision Tree (DT) classifiers divide data into smaller subsets with some similar features, until subsets are homogeneous. A DT is composed of nodes (root and interior) and leaves (terminal nodes). Each node represents a rule that is applied to the data and that controls the path to be followed to the next node. The design of a tree can be done manually, when it is based on a priori knowledge, or automatically using mathematical evaluation algorithms, such as ID3 (Iterative Dichotomiser 3), C4.5 or CART (Classification And Regression Tree) methods. Friedl and Brodley [3] and Fauvel et al. [17] used DTs to perform terrain classification and concluded that they are adaptable to noisy and nonlinear relations between land cover classes and remotely-sensed data.
Artificial Neural Networks (ANN) form a classification technique where neurons are trained to detect patterns in the training dataset and then the trained classifier is applied to unknown data. In a multi-layer ANN, the number of neurons in the input and output layers is determined by the data being analyzed (external elements), while the number of hidden layers is determined mainly by trial and error, allowing the network to solve more complex problems [15]. This classification technique has some drawbacks, in particular a long learning time, lack of transparency on the reasoning process (typically presenting a black-box behavior) and a tendency to produce overfitting. One of the first applications of ANN in remotely-sensed data was the work of Kanellopoulos and Wilkinson [33].
In their article, a set of best practices for applying ANN in remote sensing data was presented. Later, Ayhan and Kansu [15] conducted a study, comparing three multispectral image classification techniques, namely ANN, the maximum-likelihood estimator and fuzzy logic, using images from IKONOS II and Landsat. They concluded that ANN is a robust method, but with the drawback that determining the optimum network structure is a hard and fundamental task to ensure good results.
Finally, Fuzzy Inference Systems (FIS) are rule-based models described by logic operators in rules that establish relationships between fuzzy sets [34]. The set of rules, which can be provided either by experts or by creating all possible combinations between input variables [35], is composed of a set of propositions, antecedents (inputs) and consequents (outputs). FIS include three main processes [34]: fuzzification of the inputs, definition of the fuzzy rules and selection of the inference scheme to obtain the outputs. The fuzzification process refers to the representation of all input variables on the [0, 1] domain through the use of fuzzy sets. The most common fuzzification processes [34] are performed through intuition, inference or induction. Induction uses observation data to generate membership functions, and this is the chosen method in this work. Regarding the inference scheme, the most well-known models are those of Mamdani [36] and Takagi-Sugeno [25]. The Mamdani model requires defuzzification and is more computationally intensive, while Takagi-Sugeno is more suitable for mathematical analysis because it is computationally efficient and produces crisp outputs. In this work, and due to its efficiency, we use the weighted average of the Takagi-Sugeno model, but substituting the classical algebraic operators by the reinforcement aggregation operator. Throughout the literature, there are several published works where fuzzy logic classifiers were applied in remote sensing problems, but none with our specifications. Some, like [7,15,37], train the classifier (fuzzification) using simple Gaussian membership functions, which are defined only with the data's mean and standard deviation, and use a discrete max-min operator for the inference scheme. This method will be hereafter referred to as the classical method.
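The classical method described above can be sketched in a few lines. This is an illustration under stated assumptions, not the implementation used in [7,15,37]: the class names, the two-band training values and the helper names gaussian_mf and classify are all hypothetical.

```python
import math
from statistics import mean, pstdev

def gaussian_mf(samples):
    """Build a Gaussian membership function from the samples' mean and std."""
    mu = mean(samples)
    sigma = max(pstdev(samples), 1e-6)  # guard against zero spread
    return lambda x: math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def classify(pixel, class_mfs):
    """Max-min inference: minimum over bands, then maximum over classes."""
    scores = {name: min(mf(v) for mf, v in zip(mfs, pixel))
              for name, mfs in class_mfs.items()}
    return max(scores, key=scores.get), scores

# Hypothetical training samples: one list of values per band, per class.
training = {
    "water":  [[20, 22, 19, 21], [10, 12, 11, 9]],
    "forest": [[60, 65, 62, 58], [120, 118, 125, 122]],
}
class_mfs = {name: [gaussian_mf(band) for band in bands]
             for name, bands in training.items()}
label, scores = classify([21, 10], class_mfs)
```

Because the minimum is used, a single poorly matching band is enough to suppress a class, which is the behaviour that reinforcement aggregation operators are meant to soften.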

Input Data
As mentioned, our case study covers the 1989-2005 period because there is a previous study [2] where the authors studied the population return to one Mozambique district (Mandimba), after the 1992 peace accord, which resulted in observable changes in land usage. Mainly, a change was observed for shifting cultivation, a known agricultural technique, sometimes called slash and burn [38], where deforestation allows new cultivation areas, which are well observed from satellite imagery.
The imagery used was borrowed from Temudo et al. [2], which included Landsat 5 (1989) and 7 (2002 and 2005) high-resolution satellite images (8-bit data), taken over the district of Mandimba, province of Niassa, in Mozambique (Figure 1a). The choice of dataset was due to the existing expert land classification in the mentioned study, which could act as the ground-truth. Landsat 5 used a Thematic Mapper (TM) multispectral scanning radiometer and Landsat 7 the Enhanced Thematic Mapper Plus (ETM+), which has similar resolution (28.5 m/pixel) and the same spectral bands (seven) plus a panchromatic one (15 m/pixel), and each image covers the same area. The satellite images were acquired in 1989, 2002 and 2005 and cover a region of approximately 16,831 km² (4605 × 4500 pixels with a 28.5-m/pixel effective resolution). A mask covering the Mandimba district and the areas free of cloud cover in all three images was created (Figure 1b), reducing the total covered area to 3367 km². We used five spectral bands, namely Green-Band 2, Red-Band 3, Near-Infrared (NIR)-Band 4, Short Wave Infrared 1 (SWIR-1)-Band 5 and SWIR-2-Band 7, together with two vegetation indices, to ensure the distinction of vegetation types: the Normalized Difference Vegetation Index (NDVI) [39] and the Vegetation Index (VI) proposed in [40], respectively depicted by Equations (1) and (2). In their original equations, they provide normalized values in the interval [−1, 1], but here, they were rescaled to a normalized 8-bit unsigned scale [0, 255]. These vegetation indices have the advantage of being less dependent on illumination and having a good discrimination between different land cover types. They show higher values for vegetation, positive low values for water and bare soils and negative index values for clouds and snow.
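As an illustration of the index computation, the sketch below uses the standard NDVI definition, (NIR − Red) / (NIR + Red), and a linear mapping from [−1, 1] to [0, 255]. The linear mapping is an assumption, since the text does not state how the rescaling was done, and the VI of [40] is not reproduced because its formula is not given here. The function names and sample values are hypothetical.

```python
def ndvi(nir, red):
    """Standard NDVI in [-1, 1]; returns 0.0 when both bands are zero."""
    total = nir + red
    return (nir - red) / total if total else 0.0

def to_uint8(index):
    """Assumed linear mapping of an index from [-1, 1] onto [0, 255]."""
    return int(round((index + 1.0) / 2.0 * 255.0))

# Hypothetical 8-bit values for Band 4 (NIR) and Band 3 (Red) of three pixels.
nir_band = [200, 60, 0]
red_band = [50, 60, 0]
scaled = [to_uint8(ndvi(n, r)) for n, r in zip(nir_band, red_band)]
```

With this mapping, strongly vegetated pixels fall in the upper part of the 8-bit range and an index of zero lands near the middle, so the rescaled index can be histogrammed in the same way as the spectral bands.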
Our case study encompassed a total of seven land cover classes, as described in Table 1. However, we will follow the general analysis procedure of Temudo et al. [2], merging similar classes and reducing their number to five (Table 1, column "merged") to facilitate understanding the comparative results.

Overview of the Fuzzy-Fusion Uninorm Method
The fuzzy-fusion classifier is a supervised system using an inference scheme with specialized reinforcement aggregation operators. The output of each rule is the result of the aggregation of its antecedents, i.e., each rule represents a class, which evaluates a given pixel using its seven spectral information bands. The classification result is the class that obtained the maximum score. To build a classifier based on the fuzzy-fusion approach, three stages are required: (1) create the training set and build the inputs' membership functions; (2) define the rule-base system; and (3) implement the inference scheme with reinforcement aggregation operators.

Training Set
Since the satellite images for the three years were not calibrated, different supervised training sets for each year were created. Further, as mentioned before, since there is no ground-truth data available, we used the training sets built manually by the experts, Temudo et al. [2], through visual inspection of Landsat and SPOT satellite images and supported by land cover statistics produced by Mozambique National Cartography and Remote Sensing Centre (CENACARTA) (available at http://www.cenacarta.com). They selected a total of 10,840, 9950 and 10,106 sample pixels (seven bands each), representing the seven aforementioned classes, respectively for the years 1989, 2002 and 2005.
After defining the training set, seven intensity histograms (one per band or vegetation index) were defined for each class using the spectral intensity information of the sample pixels. To each histogram, one or more membership functions were assigned according to a fitting process, as illustrated in Figure 2, for the Waterbodies and Forests and Woodlands classes.
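The histogram construction can be sketched as follows. This is a minimal illustration assuming labelled sample pixels with 8-bit values; the function name, the data layout and the two-band toy samples are hypothetical (the study uses seven inputs per pixel).

```python
from collections import defaultdict

def class_histograms(samples, n_bands, levels=256):
    """Return {class: [one histogram per band]} from (label, pixel) pairs."""
    hists = defaultdict(lambda: [[0] * levels for _ in range(n_bands)])
    for label, pixel in samples:
        for band, value in enumerate(pixel):
            hists[label][band][value] += 1  # count this intensity level
    return dict(hists)

# Toy labelled samples with two bands each.
samples = [("water", (12, 30)), ("water", (12, 31)), ("forest", (80, 140))]
hists = class_histograms(samples, n_bands=2)
```

Each inner list is the frequency of every intensity level for one class and one band, which is the input to the membership function fitting.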

Table 1 (partial): Forest and Woodlands; ForestW; Areas with a tree canopy cover greater than 10%; 5
Information 2017, 8, 147

There are several methods to create fuzzy membership functions. A good overview can be found in [34]. In our case, we followed an inductive method using histograms and fitted Gaussian functions, which on the one hand, can be generated automatically from the training set and, on the other hand, provide a smooth model of the histogram distribution. Knowing that histograms describe the frequency of pixel values within a class, pixels more likely to appear have higher membership values, while the inverse applies to pixels with lower occurrences.
Since the area covered by one pixel (28.5 m × 28.5 m) may contain more than one land type or, in other words, the class selected can be composed of multiple land or vegetation types (for example, Shrublands, which are a mixture of bushes and soil), these histograms are not always smooth, unimodal or symmetrical. Bimodalities and asymmetries may occur like the ones illustrated in Figure 2. Fitting just one single Gaussian function did not always converge to the best solution. Therefore, we devised five steps to obtain the pseudo-optimal membership functions:
1. Fit the histogram with one asymmetrical Gaussian membership function;
2. Apply Otsu's thresholding method [23] to the histogram to find the two main clusters;
3. Obtain for each cluster an asymmetrical Gaussian membership function, using the cluster's mean and standard deviation values above and below the mean value;
4. Fit each cluster by an asymmetrical Gaussian membership function;
5. Use the root mean square error to select the membership function that best fits the cluster (choosing the resulting membership function of Step 1, 3 or 4).
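Steps 2, 3 and 5 above can be sketched as follows; the fitting of Steps 1 and 4 is left out. This is an illustration under assumptions rather than the authors' code: the one-sided spread is interpreted as the root mean square deviation of the bins on each side of the mean, histograms are normalized by their peak before computing the error, and all function names and the toy bimodal histogram are hypothetical.

```python
import math

def otsu_threshold(hist):
    """Return the bin t that best splits hist into bins <= t and bins > t."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0
    for t, h in enumerate(hist[:-1]):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2  # unnormalised between-class variance
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def asym_gaussian(x, mu, s_left, s_right):
    """Gaussian with a different spread on each side of the mean."""
    s = s_left if x < mu else s_right
    return math.exp(-0.5 * ((x - mu) / s) ** 2)

def moment_params(hist, lo, hi):
    """Mean and one-sided spreads of bins lo..hi (inclusive)."""
    bins = [(i, hist[i]) for i in range(lo, hi + 1) if hist[i] > 0]
    mu = sum(i * h for i, h in bins) / sum(h for _, h in bins)
    def spread(keep):
        pts = [(i, h) for i, h in bins if keep(i)]
        w = sum(h for _, h in pts)
        if w == 0:
            return 0.5  # assumed fallback when one side is empty
        return max(math.sqrt(sum(h * (i - mu) ** 2 for i, h in pts) / w), 0.5)
    return mu, spread(lambda i: i < mu), spread(lambda i: i >= mu)

def rmse(hist, lo, hi, params):
    """Root mean square error between the peak-normalised bins and the model."""
    peak = max(hist[lo:hi + 1])
    errs = [(hist[i] / peak - asym_gaussian(i, *params)) ** 2
            for i in range(lo, hi + 1)]
    return math.sqrt(sum(errs) / len(errs))

# Toy bimodal histogram with 16 intensity levels.
hist = [0, 2, 6, 9, 6, 2, 0, 0, 0, 1, 4, 8, 12, 8, 4, 1]
t = otsu_threshold(hist)
clusters = [(0, t), (t + 1, len(hist) - 1)]
fits = [moment_params(hist, lo, hi) for lo, hi in clusters]
errors = [rmse(hist, lo, hi, p) for (lo, hi), p in zip(clusters, fits)]
```

In a complete pipeline, the error of each moment-based candidate would be compared with the errors of the fitted candidates from Steps 1 and 4, and the candidate with the lowest value would be kept for that cluster.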
For the fitting process, the Levenberg-Marquardt optimization algorithm [24] was used. It solves non-linear least-squares problems, reducing the sum of square differences between the original data points (histogram) and the model function (membership function), by adjusting the model parameters. Our model uses asymmetrical Gaussian functions. Other topologies such as triangular and trapezoidal were also tested; however, Gaussian functions achieved better fitting for the data. In the case of classes with few samples (bare areas) or in some VI histograms where sparsity occurred due to rounding effects, a final manual adjustment of the Gaussian fitting was required.
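For readers unfamiliar with the algorithm, the following is a compact, self-contained sketch of a Levenberg-Marquardt loop fitting an asymmetrical Gaussian with unit peak. It is a didactic version under several assumptions (forward-difference Jacobian, Marquardt diagonal scaling, a fixed iteration budget, noise-free synthetic data) and is not the implementation used in the study; in practice a library routine would normally be preferred.

```python
import math

def asym_gaussian(x, mu, s_left, s_right):
    s = s_left if x < mu else s_right
    return math.exp(-0.5 * ((x - mu) / s) ** 2)

def residuals(p, xs, ys):
    return [asym_gaussian(x, *p) - y for x, y in zip(xs, ys)]

def solve(a, b):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x

def levenberg_marquardt(xs, ys, p0, iterations=100, lam=1e-2, h=1e-6):
    p = list(p0)
    n = len(p)
    cost = sum(r * r for r in residuals(p, xs, ys))
    for _ in range(iterations):
        r0 = residuals(p, xs, ys)
        # Forward-difference Jacobian, stored as one column per parameter.
        cols = []
        for j in range(n):
            q = p[:]
            q[j] += h
            cols.append([(a - b) / h for a, b in zip(residuals(q, xs, ys), r0)])
        jtj = [[sum(u * v for u, v in zip(cols[i], cols[j])) for j in range(n)]
               for i in range(n)]
        jtr = [sum(u * v for u, v in zip(cols[i], r0)) for i in range(n)]
        # Damped normal equations: (JtJ + lam * diag(JtJ)) step = -Jtr.
        a = [[jtj[i][j] + (lam * jtj[i][i] + 1e-12 if i == j else 0.0)
              for j in range(n)] for i in range(n)]
        step = solve(a, [-g for g in jtr])
        trial = [pi + si for pi, si in zip(p, step)]
        if min(trial[1:]) <= 0:  # reject steps that make a spread non-positive
            lam *= 10
            continue
        trial_cost = sum(r * r for r in residuals(trial, xs, ys))
        if trial_cost < cost:    # accept and trust the model more
            p, cost, lam = trial, trial_cost, lam / 10
        else:                    # reject and increase the damping
            lam *= 10
    return p, cost

# Synthetic, noise-free "histogram" generated from known parameters.
xs = list(range(30))
ys = [asym_gaussian(x, 12.0, 3.0, 5.0) for x in xs]
params, cost = levenberg_marquardt(xs, ys, p0=[11.0, 2.0, 4.0])
```

The starting point p0 plays the role of the mean and standard deviation estimate in the hybrid scheme: a reasonable initial guess keeps the damped steps inside the region where the cost decreases steadily.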
Since the area covered by one pixel (28.5 m × 28.5 m) may contain more than one land type or, in other words, the selected class can be composed of multiple land or vegetation types (for example, Shrublands are a mixture of bushes and soil), these histograms are not always smooth, unimodal or symmetrical. Bimodalities and asymmetries may occur, like the ones illustrated in Figure 2. Fitting a single Gaussian function did not always converge to the best solution. Therefore, we devised five steps to obtain the pseudo-optimal membership functions:
1. Fit the histogram with one asymmetrical Gaussian membership function;
2. Apply Otsu's thresholding method [23] to the histogram to find the two main clusters;
3. Obtain for each cluster an asymmetrical Gaussian membership function, using the cluster's mean and standard deviation values above and below the mean value;
4. Fit each cluster by an asymmetrical Gaussian membership function;
5. Use the root mean square error to select the membership function that best fits the cluster (choosing the resulting membership function of Step 1, 3 or 4).
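Steps 2, 3 and 5 of the procedure above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the toy histogram, the peak normalization of the histogram and the reading of "standard deviation above and below the mean" as a one-sided RMS deviation are all choices made here. The least-squares fits of Steps 1 and 4 are left out.

```python
import math

def asym_gaussian(x, center, sigma_left, sigma_right):
    """Asymmetrical Gaussian: a different spread on each side of the centre."""
    sigma = sigma_left if x < center else sigma_right
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

def otsu_threshold(hist):
    """Step 2: bin index t maximising the between-class variance of the
    split [0..t] / [t+1..]. The first maximiser is returned on ties."""
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0.0
    for t, h in enumerate(hist[:-1]):
        w0 += h
        sum0 += t * h
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t

def moments_membership(hist, lo, hi):
    """Step 3: centre = weighted mean of bins lo..hi; spreads = RMS deviation
    from that mean, computed separately below and above it (assumption)."""
    bins = range(lo, hi + 1)
    weight = sum(hist[i] for i in bins)
    mean = sum(i * hist[i] for i in bins) / weight

    def spread(on_side):
        sel = [i for i in bins if on_side(i)]
        w = sum(hist[i] for i in sel)
        if w == 0:
            return 1.0  # fallback spread for an empty side (assumption)
        var = sum(hist[i] * (i - mean) ** 2 for i in sel) / w
        return max(math.sqrt(var), 1e-6)

    return mean, spread(lambda i: i < mean), spread(lambda i: i >= mean)

def rmse(hist, lo, hi, params):
    """Step 5: RMSE between the peak-normalised histogram and the membership."""
    peak = max(hist[lo:hi + 1])
    errs = [(hist[i] / peak - asym_gaussian(i, *params)) ** 2
            for i in range(lo, hi + 1)]
    return math.sqrt(sum(errs) / len(errs))

# Toy bimodal histogram (made-up counts per intensity bin).
hist = [0, 2, 8, 2, 0, 0, 0, 1, 5, 9, 5, 1]
t = otsu_threshold(hist)
left = moments_membership(hist, 0, t)
right = moments_membership(hist, t + 1, len(hist) - 1)
err_left = rmse(hist, 0, t, left)
```

In a full pipeline, the same `rmse` function would also score the fitted candidates from Steps 1 and 4, and the candidate with the lowest error would be kept.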
For the fitting process, the Levenberg-Marquardt optimization algorithm [24] was used. It solves non-linear least squares problems, reducing the sum of squared differences between the original data points (histogram) and the model function (membership function) by adjusting the model parameters. Our model uses asymmetrical Gaussian functions. Other shapes, such as triangular and trapezoidal, were also tested; however, Gaussian functions achieved a better fit to the data. In the case of classes with few samples (bare areas), or in some VI histograms where sparsity occurred due to rounding effects, a final manual adjustment of the Gaussian fitting was required.
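For reference, the fitting problem can be written out as below. The parameterization (centre c, left and right spreads σ_l and σ_r) and the symbols h_i, x_i, J and λ are assumptions introduced here for illustration, since this section does not spell out the exact form used; the update shown is the standard damped Levenberg-Marquardt step.

```latex
\mu(x;\,c,\sigma_l,\sigma_r) =
\begin{cases}
\exp\!\left(-\dfrac{(x-c)^2}{2\sigma_l^2}\right), & x < c,\\[2ex]
\exp\!\left(-\dfrac{(x-c)^2}{2\sigma_r^2}\right), & x \ge c,
\end{cases}
\qquad
\min_{\theta}\; S(\theta)=\sum_{i}\bigl(h_i-\mu(x_i;\theta)\bigr)^2,
\quad \theta=(c,\sigma_l,\sigma_r)

\left(J^{\top}J+\lambda\,\operatorname{diag}(J^{\top}J)\right)\delta
  = J^{\top}\bigl(h-\mu(\theta)\bigr),
\qquad \theta \leftarrow \theta+\delta
```

Here h_i is the histogram value at bin x_i, J is the Jacobian of the model values with respect to θ, and λ is the damping factor, which is decreased after a step that reduces S and increased otherwise.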

Rule-Based System
The rule set defined for our method is composed of seven rules, one for each land cover class. The fuzzy variables are the spectral bands, with a specific membership function for each class. In general, if a class receives high scores across all N bands, then it should be assigned to the pixel. A set of K rules (K being the number of classes) is then automatically identified in the fuzzy rule-based inference system. It should be noted that, when classes contain a mixture of land or vegetation types and a more detailed training set is available, it is better to have several rules, one per land type, that share the same output class. In our case, the rule set is composed of seven rules of the form "If Band 1 is C and Band 2 is C ... and Band 7 is C, then the output is C", one for each class C. The firing level of every rule (i.e., the rule output score) is calculated with a reinforcement aggregation operator, detailed in the next section. This technique also generates a certainty measure for all classes, which makes it possible to produce certainty distribution maps.
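As a rough illustration of how the premise of such rules can be evaluated, the sketch below stores one membership function per class and band and scores a single pixel. Everything specific in it is hypothetical: the `RULES` table, its parameter values, the use of three bands and two classes instead of seven of each, and the function names.

```python
import math

def asym_gaussian(x, center, sigma_left, sigma_right):
    """Asymmetrical Gaussian membership function."""
    sigma = sigma_left if x < center else sigma_right
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)

# Hypothetical rule base: class -> one (center, sigma_left, sigma_right)
# tuple per band. The numbers are made up for the example.
RULES = {
    "ForestW": [(40.0, 5.0, 8.0), (30.0, 4.0, 6.0), (25.0, 3.0, 5.0)],
    "Cropland": [(70.0, 10.0, 10.0), (65.0, 8.0, 12.0), (80.0, 9.0, 9.0)],
}

def band_scores(pixel, rules=RULES):
    """Evaluate every rule premise: one membership score per band and class."""
    return {
        cls: [asym_gaussian(value, *mf) for value, mf in zip(pixel, mfs)]
        for cls, mfs in rules.items()
    }

# A pixel whose band values sit close to the ForestW centres.
scores = band_scores([41.0, 29.0, 26.0])
```

The per-band scores are kept separate at this stage; combining them into a single firing level is the job of the aggregation operator.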

Reinforcement Inference Scheme
As mentioned before, we follow an inference scheme that uses reinforcement aggregation operators to perform the fusion of images [1]. This fusion process with specialized aggregation operators is based on other work performed by some of the co-authors [18]. Reinforcement aggregation operators penalize results with lower scores (negative reinforcement) and reward high scores (positive reinforcement), allowing one to discard alternative classes with very low scores (details about these operators can be found in [26,27]). Formally, the inference scheme takes the scores from each band as the premise, determines the firing level of each rule by combining those scores with the aggregation operator (denoted ⊕ j), and identifies the class by selecting the best score. It should be noted that performing inference with reinforcement operators is an innovative way to apply a more positive or negative reinforcement to the rule's firing level and to the respective classification certainty for each class (for more details, see [1]). In the same article, the Uninorm reinforcement aggregation operator was found to be better for the classification of satellite images; hence, it is the one considered in this comparative study.
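The reinforcement behaviour can be illustrated with a small sketch. It uses the 3-Π uninorm (neutral element 0.5) as one well-known representative; this section does not state which uninorm the authors employ, so the operator, the function names and the example scores are assumptions made for illustration.

```python
import math

def uninorm_3pi(scores):
    """n-ary 3-Pi uninorm: prod(s) / (prod(s) + prod(1 - s)).
    Inputs above 0.5 reinforce each other upwards, inputs below 0.5 downwards."""
    p = math.prod(scores)
    q = math.prod(1.0 - s for s in scores)
    if p + q == 0.0:
        return 0.0  # both 0 and 1 present: conjunctive convention (assumption)
    return p / (p + q)

def classify(band_scores):
    """Firing level per class, then selection of the best-scoring class."""
    firing = {cls: uninorm_3pi(s) for cls, s in band_scores.items()}
    best = max(firing, key=firing.get)
    return best, firing

# Made-up per-band membership scores for three classes.
best, firing = classify({
    "ForestW": [0.8, 0.7, 0.9],    # all high: pushed above the largest input
    "Cropland": [0.4, 0.3, 0.2],   # all low: pushed below the smallest input
    "Grassland": [0.5, 0.5, 0.5],  # neutral inputs stay at 0.5
})
```

With these inputs the ForestW firing level ends up higher than any of its band scores and the Cropland level lower than any of its band scores, which is the positive and negative reinforcement described above. The firing levels can also be read as a per-class certainty value for the pixel.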

Assessment and Discussion
In the next sub-sections, we discuss the details of the four methods with respect to: (1) the classifiers' training performance and (2) the comparison of the four land cover classifications of the district of Mandimba. In the ground-truth study [2], the inputs were preprocessed by mean filtering to produce more homogeneous land cover maps. Then, the authors used a decision tree classifier to generate their land cover maps. The effect of filtering on the land cover maps can be seen in Figure 3, where the original image is compared with the FF-Uninorm result before and after applying a 3 × 3 mean filter. Figure 3b is noisier than the ground-truth image, whereas Figure 3c shows the same level of smoothness as Figure 3a. Following this observation, we preprocessed all the images in the same fashion. The Decision Tree (DT) ground-truth classifier was created with the CART algorithm, while the ANN and k-means algorithms were developed using MATLAB Toolboxes. The ANN was a fully connected feedforward network composed of seven input neurons, one hidden layer with eight neurons and seven output neurons. The k-means algorithm was configured to detect eight clusters, the extra one being the background mask, and was then applied to the full image. Its training accuracy was computed by locating the training samples in the image, obtaining their outputs and computing the accuracy scores. Since the ground-truth study [2] aggregates the seven classes into a smaller set of five classes, we merged the classes according to the merged column in Table 1.
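The 3 × 3 mean filtering used as preprocessing can be sketched as follows. The border handling (averaging only the neighbours that fall inside the image) is an assumption, as are the function name and the toy band; the original study may treat borders differently.

```python
def mean_filter_3x3(image):
    """Replace each pixel by the mean of its 3 x 3 neighbourhood.
    At the borders, only the neighbours inside the image are averaged."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            vals = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            out[r][c] = sum(vals) / len(vals)
    return out

# Toy band with a single outlier pixel in the centre.
band = [
    [10, 10, 10],
    [10, 100, 10],
    [10, 10, 10],
]
smoothed = mean_filter_3x3(band)
```

The outlier is spread over its neighbourhood, which is what makes the resulting land cover maps look more homogeneous.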

Comparison of Training Results
We compare our fuzzy-fusion model, configured with the Uninorm reinforcement aggregation operator (FF-Uninorm), against other CI techniques, namely DT (ground-truth study), ANN and k-means (an unsupervised approach). All techniques share the same training sets, previously presented in Section 3.2.1, one for each year. The accuracy was computed as the ratio of correct classifications to the total number of training samples.
The classifier training accuracies are shown in Table 2 for each training set (year). The "training samples" row presents the percentage of samples of each class within the training set, which is also used to weight the total average. From the results shown, ANN consistently produced the most accurate classifications over the three years (95.6%, 91.8% and 92.1%), although it was less consistent across classes, as can be seen in 2002, where only 22.4% accuracy was achieved for Shrublands. In contrast, k-means, being an unsupervised technique, obtained very reasonable results on the 1989 subset (78.8%), but its accuracy decreased in the subsequent training sets (63.0% and 53.7%). Fuzzy-fusion with Uninorm and DT obtained good classification accuracies (78.1-88.2% for FF-Uninorm and 85.5-90.2% for DT). It can also be noticed that all techniques, including DT and FF-Uninorm, had difficulties correctly classifying Shrublands, which could indicate a misclassified training set. From these training accuracies, a more reliable classification was expected from ANN; to confirm this, we classified the full image and discuss the results in the following section.
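The accuracy figures described above can be computed as in the following sketch. The function names and the toy labels are hypothetical; the code only illustrates per-class accuracy and the average weighted by each class's share of the training set.

```python
def class_accuracies(true_labels, predicted_labels):
    """Per-class accuracy: correct samples of a class / samples of that class."""
    totals, correct = {}, {}
    for t, p in zip(true_labels, predicted_labels):
        totals[t] = totals.get(t, 0) + 1
        correct[t] = correct.get(t, 0) + (1 if t == p else 0)
    return {c: correct[c] / totals[c] for c in totals}, totals

def weighted_average(acc, totals):
    """Average of the class accuracies weighted by each class's sample share."""
    n = sum(totals.values())
    return sum(acc[c] * totals[c] / n for c in acc)

# Toy training set: 6 Forest, 3 Cropland and 1 Shrubland sample.
truth = ["Forest"] * 6 + ["Cropland"] * 3 + ["Shrubland"]
pred = ["Forest"] * 5 + ["Cropland"] + ["Cropland"] * 3 + ["Forest"]
acc, totals = class_accuracies(truth, pred)
overall = weighted_average(acc, totals)
```

Because the weights are the class shares, the weighted average equals the plain ratio of correct classifications to the total number of samples, so a small class such as Shrubland can score poorly without moving the overall figure much.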

Comparison of Classification Results
The full-size image of the Mandimba region was also classified and the results compared with the ground-truth. The comparison was made by analyzing the total areas covered by each land cover type, by computing the Rand Index clustering agreement validity measure (explained below) and by visual inspection of the classified land cover images. In the study run by Temudo et al. [2], the main objective was to analyze the land cover changes that occurred in the period 1989-2005. The authors found that, mainly near the border with Malawi (west), land use shifted from forest and shrublands to areas devoted to crops.
The land cover area distribution over the three-year study for the four methods can be found in Figure 4, while Figure 5 shows the classified land cover images. Analyzing the areas obtained with ANN and k-means, only croplands showed the evolution reported by the ground-truth study; all the other land cover types behaved irregularly, which does not reflect the reality in this district. In the case of ANN, which obtained the best accuracy in training, the irregularity may be explained by overfitting. We conclude that ANN did not adapt to small training sets, while the other techniques did. k-means could identify the main land cover types, as depicted in Figure 5 and in more detail in Figure 6, where a road is clearly detected and the other main classes were correctly classified. Further tests with a higher number of clusters might identify more subtypes, which could then be merged into the previously defined land cover groups to achieve a higher accuracy.
Regarding FF-Uninorm and DT, both clearly depict the trend presented in the ground-truth study: the cropland area increased (from 4.5% to 21.9% for FF-Uninorm and from 5.0% to 17.8% for DT) and the forest area decreased (from 56.5% to 34.0% for FF-Uninorm and from 51.3% to 36.5% for DT), while the Others class kept approximately the same area (from 1.2% to 1.1% for FF-Uninorm and from 1.0% to 0.9% for DT). For grassland and shrubland, the results differ. FF-Uninorm identified a negative trend for grassland and a positive trend for shrubland, while DT obtained approximately stable areas for both types over this time period. These results are difficult to distinguish even by visual inspection (top row of Figure 7), since small bushes can be interleaved with grasslands, and only a detailed analysis can identify which classification method was more accurate. To improve their classification, a neighborhood analysis should be undertaken to detect mixed land cover areas. If we sum the areas of these two classes, the differences between the two methods are smaller, and a slight positive trend appears.
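The area comparison discussed above boils down to counting classified pixels per class and year. A minimal sketch follows, in which the function name, the use of `None` for masked pixels and the two toy label maps are assumptions; only the 28.5 m pixel size comes from the text.

```python
from collections import Counter

# One pixel covers 28.5 m x 28.5 m; expressed in hectares (1 ha = 10,000 m^2).
PIXEL_AREA_HA = 28.5 * 28.5 / 10_000

def area_percentages(label_map, background=None):
    """Share of the non-masked area covered by each class, in percent."""
    counts = Counter(
        label for row in label_map for label in row if label != background
    )
    total = sum(counts.values())
    return {cls: 100.0 * n / total for cls, n in counts.items()}

# Toy classified maps for two dates; None marks masked (background) pixels.
year_a = [
    ["Forest", "Forest", "Forest", None],
    ["Forest", "Crop", "Grass", None],
]
year_b = [
    ["Forest", "Crop", "Crop", None],
    ["Forest", "Crop", "Grass", None],
]
shares_a = area_percentages(year_a)
shares_b = area_percentages(year_b)
crop_change = shares_b["Crop"] - shares_a["Crop"]
```

Multiplying a class's pixel count by `PIXEL_AREA_HA` gives its absolute area, while the percentage form is the one that lets different years and methods be compared directly.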
Another validity measure, used especially to assess the k-means performance, was the Rand Index (RI) cluster agreement measure. RI is an external clustering validity measure based on pairwise comparison: for every pair of points, it checks whether the two compared clustering solutions agree on placing the pair in the same cluster or in different clusters [41]. An RI value of one indicates that both clusterings are identical, and a value of zero indicates that they do not agree on any pair of points. Table 3 shows the agreement between the four methods for each year and the average over the three-year study.
The agreement results reinforced the conclusion that FF-Uninorm and DT are the most similar pair (0.8 on average). Furthermore, they showed that the k-means clustering technique, despite being unsupervised, was quite consistent throughout the study and obtained relatively high scores when compared with FF-Uninorm and DT (0.73-0.84 with FF-Uninorm and 0.69-0.76 with DT).
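The Rand Index used above follows directly from its pairwise definition, as in this short sketch. The function name and the toy label vectors are illustrative; the brute-force loop over all pairs is quadratic in the number of pixels, so for full images a contingency-table formulation would normally be preferred.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two labelings agree: both place the
    pair in the same cluster, or both place it in different clusters."""
    agree = total = 0
    for i, j in combinations(range(len(labels_a)), 2):
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += same_a == same_b
        total += 1
    return agree / total

# Identical partitions with swapped label names still agree on every pair.
ri_same = rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Partitions that split the points differently agree on only some pairs.
ri_mixed = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Since only co-membership matters, the index does not depend on how the clusters are named, which is what allows an unsupervised k-means result to be compared with the supervised classifications.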
In summary, our model with the FF-Uninorm operator produced results that are adaptable and consistent with DT (bottom row of Figure 7). k-means was shown to be an adequate technique whenever there is no prior knowledge of the land cover classes.

Conclusions
In this work, we extended previous work in which a fuzzy-fusion approach with reinforcement aggregation operators was proposed for land cover classification from multispectral satellite images. The main aim of the approach was to fuse spectral information (from a multispectral satellite imagery source or other) to produce land cover maps, and the preliminary approach was compared with two others from the computational intelligence realm. In this article, we improved the preliminary study with two additions. First, we enriched the comparison by introducing a new example composed of satellite images from three years (1989, 2002 and 2005), which enables comparing the evolution of the land cover area distribution with the results of a study run in this region, which acts as the ground-truth. Second, we included another promising method in our comparison, the k-means unsupervised clustering technique, to assess its suitability as an alternative classification method.

Our fuzzy-fusion approach improved the training of common fuzzy classifiers by: (a) proposing a hybrid algorithm that performs clustering of the pixel intensity histograms and fits them with Gaussian functions; and (b) using a reinforcement operator (Uninorm) as a novel inference scheme for the rule-based classification. Furthermore, this method can produce classification certainty maps that help to assess the results and detect possible improvements (although this functionality was not used in this study).
When comparing the training accuracy of our FF-Uninorm approach with the other three computational intelligence techniques (decision trees, artificial neural networks and k-means), its overall accuracy was lower than that of DT and ANN. However, when applied to the full images, it performed similarly to DT, and we managed to successfully reproduce the ground-truth results. A larger disparity between FF-Uninorm and DT was noticed in the classification of grassland and shrubland, although this might be because these classes are composed of a mixture of bare and vegetated areas, and these methods cannot capture this spatial distribution with a single-pixel classification.
The introduction of the k-means unsupervised technique in this study gave positive insights into the ability of the method to analyze unknown areas or to prepare training sets for other methods. In most tests, it produced reasonable classifications and obtained a good agreement with FF-Uninorm and DT.
As future directions, we identify the need for improvements in the training set generation and, consequently, in the membership function creation, mainly to deal with the small number of samples of less frequent classes. Furthermore, we plan to introduce a pixel neighborhood analysis on top of the pixel classification, to improve the classification of land cover classes that have a mixture of land types (e.g., Shrublands).

Figure 1. (a) Landsat 5 image taken in 1989 over the district of Mandimba (RGB-Bands 743) and (b) in black, the mask of the Mandimba district and areas not covered by clouds in the 3 images.

Figure 2. Example of histograms and unimodal and bimodal membership function fits for classes Waterbodies and Forests and Woodlands.

Figure 4. Land cover areas distribution along the three-year study for the four classification methods: (a) FF-Uninorm; (b) Decision Tree; (c) Artificial Neural Network; (d) k-means clustering.

Figure 5. Land cover classification results using FF-Uninorm, decision trees, artificial neural networks and k-means clustering.

Figure 6. Detail of the land cover classification using FF-Uninorm and k-means, where a road is clearly seen and correctly identified by both methods.

Figure 7. (Top row) Detail on the Grassland and Shrubland classification comparison between FF-Uninorm and DT. (Bottom row) Correct and detailed classification obtained by FF-Uninorm and DT.

Table 1. Classes used for classification and merged 5-class distribution.

If Band 1 is Forest and Woodlands (ForestW) and Band 2 is ForestW ... and Band 7 is ForestW, then the output is ForestW.

Table 2. Accuracy of classifying the training set with FF-Uninorm, DT, ANN and k-means.

Table 3. Rand Index agreement validity measure among FF-Uninorm, DT, ANN and k-means.