Improving Land Cover Classifications with Multiangular Data : MISR Data in Mainland Spain

In this study, we deal with the application of multiangular data from the Multiangle Imaging Spectroradiometer (MISR) sensor for studying the effect of surface anisotropy and directional information on the classification accuracy for different land covers with different rate of disaggregation classes (from four to 35 different classes) from a Mediterranean bioregion in Iberian, Spain. We used various MISR band groups from nadir to blue, green, red, and NIR channels at nadir and off-nadir. The MISR data utilized here were provided by the L1B2T product (275 m spatial resolution) and belonged to two different orbits. We performed 23 classifications with the k-means algorithm to test multiangular data, number of clusters, and iteration effects. Our findings confirmed that the multiangular information, in addition to the multispectral information used as the input of the k-means algorithm, improves the land cover classification accuracy, and this improvement increased with the level of disaggregation. A very large number of clusters produced even better improvements than multiangular data.


Introduction
Remote sensing techniques offer a unique environmental monitoring capability that covers large geographical areas in a cost-efficient manner.This science can capture important information about the Earth´s land, atmosphere, and body of waters.Remote sensing products have been usefully employed in numerous applications, for instance, carbon emission monitoring [1][2][3], land change detection [4], forest monitoring [5,6], natural hazard assessment [7,8], agriculture and water/wetland monitoring [9][10][11][12], climate dynamics [13][14][15][16], and biodiversity studies [17][18][19].One of the applications of remote sensing is to classify images to obtain classified maps.Classification in remote sensing groups the pixels of an image into a set of classes, such that pixels in the same category have similar properties.Most of the image classifications are based on the detection of the spectral response patterns of land cover classes [20].Traditional remote sensing classification depends on distinctive signatures for the land cover categories or types in the band set that is being used.It regards the ability to distinguish these signatures from other spectral response patterns that may be present [21].The application of remote sensing data for land cover and land use mapping, and their changes, is key to many diverse applications that have been mentioned before [22,23].
Land Use and Land Cover (LULC) mapping using satellite or airborne imagery has increased exponentially over the past decades.This growth is due in part to the wide availability of sensors, such as the Moderate Resolution Imaging Spectroradiometer (MODIS), MEdium Resolution Imaging Spectrometer (MERIS), Landsat, Sentinel, Satellite for observation of Earth (SPOT), aerial photography, and drones, that generate satellite imagery of an adequate spatial resolution at different temporal or angular resolutions.It is also due to improved data availability and accessibility of the sources of remote sensing information [22].
Many approaches exist to classify remote sensing data.However, these converge into three principal methods: supervised, unsupervised, and hybrid classification techniques.Supervised classification requires prior knowledge of the ground cover in the study area.There are some examples in which multispectral or hyperspectral data are used as the input of pixel information to train a classification algorithm [24].Some supervised methods are the maximum likelihood, Spectral Angle Mapper [25], Support Vector Machine [26,27], and Decision Trees [28].These methods group pixels around training samples and a classified image has the same training classes.However, in unsupervised classification, an algorithm finds a prespecified amount of statistical clusters in multispectral or hyperspectral space.Usually, these groups are not equivalent to actual classes of land cover; however, this method can be used without having prior knowledge of the ground cover in the study site [29].Some unsupervised techniques are k-means, principal components, and autoencoder [30].Hybrid methods, meanwhile, involve firstly applying an unsupervised classification and then a supervised one [31].An important issue when a classification experiment is developing is the accuracy of the final product.In this sense, some authors affirm that the accuracy of the results obtained by using unsupervised classification is similar to those of the supervised approach [31,32].
Although the most common basis of classification is multispectral and hyperspectral information, sometimes and in some zones to classify or map LULC, there are classes that overlap regarding the mean, median, or standard deviation in a single spectral band at a single date.Thereby, the final accuracy of classification will also decrease due to the overlapping.This problem may be solved by using, in addition to the multispectral or hyperspectral data as the input algorithm, multiangular information.Ever since the first data obtained by satellite in the 1970s, it has been possible to observe the fact that the reflectance of different land covers is dependent on the observation and illumination angles.That dependence implied that the land covers are anisotropic, that is, they did not reflect the same amount of energy in all directions.The magnitude best explained that reflectance is dependent on both the observation and illumination angles and the wavelength of the reflected radiation [33] is the Bidirectional Reflectance Function (BRDF) that, if referred to a Lambertian object, is called the Bidirectional Reflectance Factor (BRF).
The anisotropy that different terrestrial surfaces present was considered as an avenue to explore.For its study, the launch of satellites with sensors onboard was promoted.These sensors have allowed the study of the different directional behaviors when a target reflects solar radiation [34]; prominent examples include the POLarization and Directionality of the Earth's Reflectances (POL-DER) instrument, the multiangle Imaging Spectro-Radiometer (MISR), and the Compact High-Resolution Imaging Spectrometer (CHRIS Proba).The last two can capture images of almost the same pixel from several different angles during the same flight path [35].In this manner, it is easy to find studies where MISR products have been successfully used.Various authors (e.g., [36][37][38][39][40]) have proven their utility for characterizing the vegetal canopy and for mapping vegetation and land covers at medium and large spatial scales.This paper deals with the application of multiangular MISR data to study the effect of surface anisotropy and directional information on the classification accuracy of almost pure pixels of different land covers with a different rate of disaggregation classes in mainland Spain.Some authors have improved the classification accuracy by using multiangular and multispectral remote sensing information [41][42][43][44][45][46].For instance, Abuelgasim et al. [41] classified five different land covers using the Advanced Solid-State Array Spectroradiometer (ASAS) in Voyageurs National Park, Minnesota.These authors used an artificial neural network method and achieved an improvement of 1% in accuracy by using the multiangular and multispectral information (89%) over using only that which is multispectral (88%).Additionally, Hyman and Barnsley [42] attempted to classify three different Amazonian forest land cover types with MISR data, applying both Principal Components Analysis (PCA) and the minimum distance algorithm as classification techniques.These authors demonstrated that data acquisition in off-nadir viewing improved the discrimination and mapping of the land cover types overall using the −21.6 • view angle, for which accuracies improved by 16.3% (from 78.5% to 95.0%) compared with those for the nadir view.Furthermore, Liesenberg et al. [44] performed classification with MISR data to map three types of forest (from close to open vegetation forest) in the Brazilian savanna.These last authors used the maximum likelihood classification technique, and found that multiangular information could help to distinguish between those classes that are homogenous for the multispectral indicators.Mahtab et al. [45] used MISR data from different dates in two different years.Their study area was Madhya Pradesh, India, and they performed PCA and used a parallelepiped classifier.The authors also found that multiangular information, in combination with multispectral (green, red, and NIR) data, resulted in high classification accuracies for all the dates that they used.Likewise, Su et al. [46] were able to classify arid and semiarid dessert grassland classes located in the Chihuahuan semi-desert province, in New Mexico, by using MISR and MODIS multiangular and multispectral data.Using maximum likelihood and the Support Vector Machine (SVM), the authors achieved results that rounded from 45.8% using nadir reflectance as the input of classification, to 76.7% using SVM algorithms and multiangular and multispectral information combined as the input of classification.Many others studies have also shown that MISR multiangular data can be used to obtain vegetation structure information, such as canopy cover and canopy height [39,[47][48][49][50][51][52][53][54].Therefore, MISR multiangular data show great potential to obtain improved low-to-medium resolution regional land cover maps [26,[44][45][46]55], differentiating classes from the structural point of view.
Against this background, our objective was to assess the possible beneficial effect of incorporating multiangle information in the classification of images in many types of land cover in mainland Spain.An improvement in classification accuracy was expected, but we wanted to know how this improvement was dependent on the level of disaggregation and how land covers were disaggregated.We additionally wanted to test the effects of changing parameters of unsupervised methods, and their relationship with different levels of class aggregation.

Study Site
The study area in this research was the intersection of MISR path 201, and mainland Spain.This area starts in MISR Block 56 and ends in Block 61 (See Figure 1).Mainland Spain has a wide variety of natural surfaces and is an example of the Mediterranean zone.

Study Site
The study area in this research was the intersection of MISR path 201, and mainland Spain.This area starts in MISR Block 56 and ends in Block 61 (See Figure 1).Mainland Spain has a wide variety of natural surfaces and is an example of the Mediterranean zone.

MISR and MISR Data
MISR is a sensor boarded on the Earth Observing System (EOS) Terra satellite with a 705 km sun-synchronous Earth orbit.The MISR instrument has nine push cameras capable of viewing the Earth's surface from different angles (nominal and design values), namely 0 • (An), 26.1 • (Af, Aa), 45.6 • (Bf, Ba), 60.0 • (Cf, Ca), and 70.5 • (Df, Da), relative to nadir.In both, (f) corresponds to forward and (a) is relative to afterward, along the direction of the flight.This sensor obtains Earth information in four spectral bands positioned at 446, 558, 672, and 866 nm, with a temporal resolution of two to nine days and a spatial resolution of either 1100 or 275 m [56].
The MISR team provides several elaborated data products with different processing levels.In this study, we used MI1B2T to classify and MIL2ASLS to atmospherically correct one of the MI1B2 images.The first two products are available at a 275 m spatial resolution, and the last one at a 1100 m spatial resolution.The MI1B2T gives radiance values and a scalar element to convert them into the BRF.The MIL2ASLS provides the BRF, The Rahman-Pinty-Verstraete Modified (MRPV) parameters of BRF, the Aerosol Optical Depth (AOD), and some quality-related values [56].
We classified two MISR MI1B2T images.The first one was from orbit 29476 and was obtained on 3 July 2005, and the second was from orbit 45320 and was obtained on 25 July 2008.Both are from MISR product version F03-0024.Image 29476 was atmospherically corrected to study possible improvement.The data belonged to MISR path 201.The image from 2006 was selected because the land cover map used to evaluate improvement was from 2006, while the image from 2008 was chosen to maximize common zones but with the same month.

CORINE Land Cover 2006
The CORINE (Co-ordination of Information on the Environment) land cover (CLC) consists of a set of maps of the European environmental landscape, based on the photographic interpretation of satellite images (SPOT, LANDSAT TM, and MSS).Ancillary data (aerial photographs, topographic and vegetation maps, statistics, and local knowledge) were used to improve the interpretation and the assignments of the territory into the categories of the CLC nomenclature.The smallest surfaces mapped corresponded to 25 hectares, and the minimum mapping width was 100 m [57].General accuracy was 87.1% in CORINE 2000 version [58], and individual class reliability ranged from 12.5% to 100%.The CORINE 2006 version was evaluated by testing changes between 2000 and 2006, and the accuracy of these changes was found to be 87.82%[59].This project presents comparable digital maps of land cover (and partly land use) for each country based on the terminology of 44 categories organized hierarchically in three levels.
In this work, we used the map from Spain, which consisted of a 1:100,000 vectorial layer from 2006.To evaluate the reliability of the MISR classification results, we used the occupancy map of the CORINE Land Cover 2006 three levels of depth, taking as a reference almost 200,000 points included in a buffer of 137.5 m of ratio, to ensure that we obtained pixels with the presence of only one of the CORINE classes.
There were 35 initial reference classes for the maximum disaggregation; we did not consider all of the 44 classes because nine classes had no points.We grouped these categories in different ways to evaluate up to seven levels of differentiation.We started the experiment with the five classes of CORINE level 1.The classification groups are listed in Table 1.

Data Processing and Data Analysis
The MI1B2T, MIL2ASLS, and MIB2GEOP data from the two MISR schemes at the blue, green, red, and NIR band were downloaded from the NASA Langley Distributed Active Archive Center (DAAC).Then, the data from the Space Oblique Mercator (SOM) were re-projected to the Universal Transverse Mercator (UTM) system, the MI1B2T radiances were transformed into BRF values using the scalar factor also provided on that product, and a visual cloud mask was applied to avoid cloud contamination.
Orbit 29476 was subjected to an atmospheric correction based on regression with data from the MIL2ASLS 1.1 km × 1.1 km product.This method has been used by other authors [60,61].The process consists of taking advantage of the atmospheric correction routine performed by the MISR Team to obtain the MIL2ASLS product with data obtained thanks to the AOD estimation.AOD is estimated at a coarser 17.6 km × 17.6 km spacing, so every square is atmospherically corrected independently.This method occasionally produces a quilting effect.Through a regression per every square of 17.6 km between the product MI1B2T aggregated to 1.1 km, and the MIL2ASLS, we performed an atmospheric correction of the BRF values of the MISR bands belonging to the first product (MI1B2T).Those pixels in which the correlation coefficient per square between the two products was less than 0.9 were excluded.Figure 2 displays the masking effect of this method due to the absence of data squares in the MIL2ASLS product.
As we were interested in evaluating multiangular data improvement in classification, we decided to use an unsupervised classification method to allow self-clustering of that data.Additionally, unsupervised methods are more flexible for assessing different parametrizations of the experiments.The k-means algorithm was used to classify the MISR images.The k-means algorithm belongs to the group of unsupervised classifiers since it does not use a previous sample of points by classes to classify; rather, it seeks to group the data in terms of the number of types that are indicated, minimizing intra-group variability, by reducing the distances between each point and the average of its class [20].

Results
We obtained 23 classified maps and 23 × 7 confusion matrices.As can be seen in Figure 2, the general appearance is consistent with original CLC.Other examples are similar because broadly distributed classes are much better classified than others are.  2. The number of well-assigned pixels increased when both the number of k-means classes and the number of bands did (although when eight or nine bands were used, the results differ by cases).On the contrary, the values of OA decreased when the number of classes to differ increased, except for level 6, for which better results were obtained than Levels 4 and 5.There is a significant leap between levels 2 and 3.The allocation of data was done iteratively.Thus, the calculation and reassignment were performed in classes in each step.For this reason, the initial points that are taken as class representatives may have great importance in the final classification, since the initial classes are built around them [20].To avoid such interference, and to evaluate this parameter's importance, the process was performed with a large number of categories, allowing a good representation of searched typologies and a high number of iterations, enabling large data transfers between classes.The primary factors that were modified were the number of categories, the bands to be taken into account, and the number of iterations.Two different images were used, one of which was corrected atmospherically to check if such correction had any effect on the result.
The different band groups used to classify were: (i) the four zenithal, (G-An, B-An, R-An, and NIR-An), i.e., the typical multispectral information; (ii) the zenithal plus the four nearer nadir (G-An, B-An, R-An, NIR-An, R-Bf, R-Af, R-Ba, and R-Bf); (iii) only multiangle information, the nine bands of red (R-An, R-Bf, R-Af, R-Ba, R-Bf, R-Cf, R-Cf, R-Da, and R-Df); and (iv) all available (G-An, B-An, R-An, NIR-An, R-Bf, R-Af, R-Ba, R-Bf, R-Cf, R-Cf, R-Da, and R-Df).We chose to test the effect of taking into account only multispectral information, only multiangle information, and combinations of both with all existing bands, and without taking into account the most external cameras (which are those that have topographical hiding problems but have essential multiangular information).We excluded points without all nine bands of data to avoid missing values of external bands.
We used the CLC map to assess the reliability of the classifications that resulted from this process.
For the assignment of each spectral class provided by the k-means classification algorithm to a type of the CORINE map, we tended to maximize the number of well-graded pixels.Thereby, we assigned the CORINE class with more points.Once the classes had been assigned, we calculated the total number of well-graded pixels.The kappa coefficient was calculated for each of the experiments and levels to obtain a better value that could discriminate the goodness of the classification results.
The kappa coefficient is a good indicator of the goodness of classification since it takes into account the possible random effect in it.If one class has many more elements than another does, the probability of failure is very high.Therefore, this coefficient takes into account the total number of items in each class, both real and classified.
Additionally, we constructed the confusion matrix for the different levels, and thus were able to determine the confusion that occurred between individual classes.A confusion matrix [62] contains information about actual and predicted classifications made by a classification system.The accuracy of the producer is the percentage of cases correctly classified.The omission error is 100 minus the producer's accuracy and indicates the rate of points in a class that have not been successfully assigned.The accuracy of the user is the percentage of points within a category that do not belong to it.Finally, the commission error is 100 minus this value [20].
To analyze the possible effect of the different configurations of the 12 bands used for the present study, the orbit (24976, 45320, and 24976 atmospherically corrected) number of iterations (20, 30, 50, 100, 400, and 1000) and classes used as the input in the k-means algorithm analyses (30,50,200, 400, and 1000) were considered.Variance analysis was performed using, as the numerical variable, first the overall accuracy (OA) and second the kappa coefficient (K) results.In that approach of variance analyses, the factors used for testing were the level of disaggregation, the orbits, the bands, the number of classes, and the number of iterations.
We applied some transformation to the numerical variables, in some cases, for the ANOVA, and also performed the Kruskal-Wallis test (a non-parametrical test of variance) as some analyses of variance did not meet the homoscedasticity assumption of the parametrical test.Finally, we used a post hoc pairwise test (Games-Howell) to determine which pairs of categories involved in our study were statistically different.
In addition to the test of variance for the global approach, we made multiple-regression models (with the same input variables as in the global approach tests) and thereby determined the independent contribution of each explanatory variable to the response variable of the joint contribution resulting from the correlation with other variables, by using the hierarchical partitioning algorithm [63].The hierarchical partitioning tests were performed using R software with the help of the "hier.part"package (version 1.0-4).

Results
We obtained 23 classified maps and 23 × 7 confusion matrices.As can be seen in Figure 2, the general appearance is consistent with original CLC.Other examples are similar because broadly distributed classes are much better classified than others are.
The OA results according to the reclassification and assignment of the k-means classes to the CLC 2006 classes are shown in Table 2.The number of well-assigned pixels increased when both the number of k-means classes and the number of bands did (although when eight or nine bands were used, the results differ by cases).On the contrary, the values of OA decreased when the number of classes to differ increased, except for level 6, for which better results were obtained than Levels 4 and 5.There is a significant leap between levels 2 and 3.
The highest differences between consecutive levels were those found related to Levels 2 and 3. Levels 6 and 7 were the pair with the most variance of all the classifications performed (see Table 2).The pairs made up of Levels 1 and 2, and 4 and 5, are quite similar in all the classifications performed, while for the pair made up of Levels 5 and 6, the differences in the OA were the highest for the orbit 45320 classifications.
It seems remarkable that for Level 1 and Level 2, the percentage of pixels with a correct class assignation was higher than 90% in all cases, with scarce differences between factors.However, when we compared the last result with the more specific classifications, there were more considerable differences between using some bands, the number of classes, and the iterations.Furthermore, there were larger differences between those percentages and the assigned class variable.
Table 3 shows that kappa values decreased considerably concerning all the correctly classified pixels.Additionally, kappa values were positively correlated with the number of classes and number of bands.A large difference was observed in the most disaggregated classification, finding in some cases, very low kappa values.The differences between the kappa indices in Levels 4-5 and 5-6 are similar for all the experiments performed, with the exception of orbit 45320.As Figure 3 suggests, there is a relationship between the total number of pixels belonging to a class and the kappa coefficient value.The classes with a low number of points had low kappa values (in many cases equal to zero).However, some classes had different kappa values than expected, namely classes 244, 312, 333, 411, 422, and 511.It is important to highlight that the classes with average OA percentages higher than 70 were 211, 244, and 312.However, as can be seen in Table 5, other classes had good values of OA and kappa in some experiments (121, 212, 213, 333, and 422).It can also be seen that classes with a low number of CLC points usually have low accuracy values.
From all the confusion matrices (data not shown), it was possible to observe that the 211 category classified many points that were from other categories.Additionally, this confusion was more common in orbit 45320 than in the other one.The categories belonging to forest zones (311 to 334) could be better differentiated from the agricultural ones (211 to 244).However, grasslands (321) could not be differentiated from the agricultural categories in many cases.Urban categories (level 1 1) always showed a very low OA.It is important to highlight that the classes with average OA percentages higher than 70 were 211, 244, and 312.However, as can be seen in Table 5, other classes had good values of OA and kappa in some experiments (121, 212, 213, 333, and 422).It can also be seen that classes with a low number of CLC points usually have low accuracy values.
From all the confusion matrices (data not shown), it was possible to observe that the 211 category classified many points that were from other categories.Additionally, this confusion was more common in orbit 45320 than in the other one.The categories belonging to forest zones (311 to 334) could be better differentiated from the agricultural ones (211 to 244).However, grasslands (321) could not be differentiated from the agricultural categories in many cases.Urban categories (level 1 1) always showed a very low OA.
Moreover, the broad-leaved forest (311) category had poor classification results in general; we observed pixels from the 311 category classified in the same proportion as coniferous forest (312) or sclerophyllous vegetation (323).
The analyses of variance results for the first approach mentioned in the material and methods section are shown in Table 6.These results suggest that the factors band and classes are significant in most cases (p-value < 0.05).The number of iterations was only statistically significant in the highest level of disaggregation, whereas the orbit was significant for this maximum level of disaggregation and Levels 4 and 5. Table 7 shows the global approach variance analyses performed, taking into account as another class the level, in addition to the bands, classes, iterations, and orbits.These results suggest that there are differences in the classification results by band, class, iteration, orbit, and the iteration between bands and levels.The Fisher statistical value (F value) was the highest for the levels' categorical variable, followed by the classes used as an input in the k-means algorithm and bands.-6 show the results of the pairwise post hoc tests, in which we observe the differences between every group that built every factor or categorical variable.
It can be seen that the mean results of OA and kappa differ significantly (p-value < 0.05) between the experiments with nine and 12 bands, and between the experiments with four and 12 bands.Besides the classifications, the accuracy was better when we introduced the 12 bands than when we only introduced the multispectral information.Additionally, it is remarkable that we found lower OA and kappa coefficient values for the four MISR band experiments than for the other combinations of bands (see Figure 4a).Figure 4c shows that there were no statistically significant differences between the orbit classification accuracy for orbit 29476 and orbit 29476 (atmospherically corrected).The classification accuracy results for orbit 45320 were always slightly worse than those for orbit 29476 (see Figure 4c).
In general, better classification results were obtained when the number of classes for the k-means algorithm set was higher (see Figure 5a).Regarding the number of iterations, we inferred that the pairs made up of 30-100 iterations were the only pairs that differed significantly in the OA classification results.However, in the case of kappa coefficients, that pair of groups was also different and in addition to these, were those made up of 400-1000 and 100-1000 iterations.In these last two cases, the kappa coefficients were higher when we classified setting 100 iterations (reaching kappa values of 0.7) (see Figure 5b).
Figure 6 shows that the nine band category for Level 1 achieved the worst OA and kappa values (95% CI for the mean of band level) of all the band categories.For Level 2, the OA values for the nine band group were the worst of all the band groups, but this was not true for the kappa values.For Level 3, the OA nine band group values were similar to those achieved for the four band group category.On the other hand, in Levels 4, 5, 6, and 7, the nine band category OA and kappa values were higher than those for the four band group.Thereby, the only use of multiangular remote sensing information (nine bands) for the classifications yields worse results than the single purpose of multispectral data (four bands).Additionally, for all levels, the values of OA and kappa were higher when we used all available bands in the L1B2T MISR product (G-An, B-An, R-An, NIR-An, R-Bf, R-Af, R-Ba, R-Bf, R-Cf, R-Cf, R-Da y R-Df).Figure 7 shows that adding more bands to four nadir bands (groups 8 and 12) improves the accuracy, with values from 3.7% (eight bands Level 1) to 91.3% (12 bands Level 7).It can be seen how relative improvements increase with Level 9 multiangular red bands; we obtained worse results (negative relative values) than for four nadir bands in Levels 1 and 2, but more desegregated classes  Figure 7 shows that adding more bands to four nadir bands (groups 8 and 12) improves the accuracy, with values from 3.7% (eight bands Level 1) to 91.3% (12 bands Level 7).It can be seen how relative improvements increase with Level 9 multiangular red bands; we obtained worse results (negative relative values) than for four nadir bands in Levels 1 and 2, but more desegregated classes Figure 4b exhibits how in both the OA and kappa results, the values for Levels 1 and 2 were statistically higher than for the other levels.In Levels 3, 4, 5, and 6, the OA and kappa values were similar and different to those in Levels 1 and 2, and they were different to those of Level 7 (overall, those values achieved by the kappa).
Figure 4c shows that there were no statistically significant differences between the orbit classification accuracy for orbit 29476 and orbit 29476 (atmospherically corrected).The classification accuracy results for orbit 45320 were always slightly worse than those for orbit 29476 (see Figure 4c).
In general, better classification results were obtained when the number of classes for the k-means algorithm set was higher (see Figure 5a).Regarding the number of iterations, we inferred that the pairs made up of 30-100 iterations were the only pairs that differed significantly in the OA classification results.However, in the case of kappa coefficients, that pair of groups was also different and in addition to these, were those made up of 400-1000 and 100-1000 iterations.In these last two cases, the kappa coefficients were higher when we classified setting 100 iterations (reaching kappa values of 0.7) (see Figure 5b).
Figure 6 shows that the nine band category for Level 1 achieved the worst OA and kappa values (95% CI for the mean of band level) of all the band categories.For Level 2, the OA values for the nine band group were the worst of all the band groups, but this was not true for the kappa values.For Level 3, the OA nine band group values were similar to those achieved for the four band group category.On the other hand, in Levels 4, 5, 6, and 7, the nine band category OA and kappa values were higher than those for the four band group.Thereby, the only use of multiangular remote sensing information (nine bands) for the classifications yields worse results than the single purpose of multispectral data (four bands).Additionally, for all levels, the values of OA and kappa were higher when we used all available bands in the L1B2T MISR product (G-An, B-An, R-An, NIR-An, R-Bf, R-Af, R-Ba, R-Bf, R-Cf, R-Cf, R-Da y R-Df).
Figure 7 shows that adding more bands to four nadir bands (groups 8 and 12) improves the accuracy, with values from 3.7% (eight bands Level 1) to 91.3% (12 bands Level 7).It can be seen how relative improvements increase with Level 9 multiangular red bands; we obtained worse results (negative relative values) than for four nadir bands in Levels 1 and 2, but more desegregated classes did not occur.Additionally, four external bands improve the accuracy, concerning to eight-band improvements (see large differences between 8 and 12 bands).Figure 8 shows the results of the hierarchical partition test.These results suggest that the level is the variable that presented the highest independent contribution of the explained variables as a percentage of the total explained variance.The variables that followed the level were the classes used by the k-means process and the bands involved in the classification experiment.The orbit is the variable that contributes least to the total OA and kappa variance explained by band, level, orbit, class, and iterations.It is remarkable that class is the second variable that most variations of the reliability classification showed, and relatively, this variable explained almost 26% more variability than the band chosen.In the k-means algorithm, the class explained around 42% more information than the bands and nearly 45% more variability than the iterations also involved in the k-means algorithm.Figure 8 shows the results of the hierarchical partition test.These results suggest that the level is the variable that presented the highest independent contribution of the explained variables as a percentage of the total explained variance.The variables that followed the level were the classes used by the k-means process and the bands involved in the classification experiment.The orbit is the variable that contributes least to the total OA and kappa variance explained by band, level, orbit, class, and iterations.It is remarkable that class is the second variable that most variations of the reliability classification showed, and relatively, this variable explained almost 26% more variability than the band chosen.In the k-means algorithm, the class explained around 42% more information than the bands and nearly 45% more variability than the iterations also involved in the k-means algorithm.
variable that contributes least to the total OA and kappa variance explained by band, level, orbit, class, and iterations.It is remarkable that class is the second variable that most variations of the reliability classification showed, and relatively, this variable explained almost 26% more variability than the band chosen.In the k-means algorithm, the class explained around 42% more information than the bands and nearly 45% more variability than the iterations also involved in the k-means algorithm.

Discussion
Overall, the classification experiments only achieved good results (OA values larger than 0.9) for Level 1 and Level 2. High OA values were observed for the low-level classification; however, kappa values for these levels were medium.Significant differences in the number of points per class explain discrepancies between OA values and kappa values.Some classes with low presence have low OA class values, such that kappa coefficients can be smaller than OA values.Table 5 shows that some classes (121, 212, 213, 333, and 422) have larger values in some classification results than the average value.Therefore, it can be seen that our unsupervised approach is dependent on the frequency of the land covers, probably due to the random selection of initial seeds.These results imply that a supervised method may be able to obtain betters results, even for low-frequency classes.
Furthermore, we must take into account the fact that CLC global accuracy is not applicable to all classes; some of them have medium or low values (133, 141, 222, 241, 332, 333, and 411), for example, [64] found that forest land covers can have medium accuracies.

Discussion
Overall, the classification experiments only achieved good results (OA values larger than 0.9) for Level 1 and Level 2. High OA values were observed for the low-level classification; however, kappa values for these levels were medium.Significant differences in the number of points per class explain discrepancies between OA values and kappa values.Some classes with low presence have low OA class values, such that kappa coefficients can be smaller than OA values.Table 5 shows that some classes (121, 212, 213, 333, and 422) have larger values in some classification results than the average value.Therefore, it can be seen that our unsupervised approach is dependent on the frequency of the land covers, probably due to the random selection of initial seeds.These results imply that a supervised method may be able to obtain betters results, even for low-frequency classes.
Furthermore, we must take into account the fact that CLC global accuracy is not applicable to all classes; some of them have medium or low values (133, 141, 222, 241, 332, 333, and 411), for example, [64] found that forest land covers can have medium accuracies.
Global accuracy results are comparable or better to those obtained by other studies; for example, classification using k-means for all of France with a 1 km spatial scale using SPOT/VEGETATION [65].This example had as points to test the accuracy of several CLCs reclassified by the authors.It should be noted that, at the time of evaluating the results obtained in the present study, the classified images were all of the Iberian Peninsula; this implies that we have to take it into account when we classify the complexity of the Mediterranean zones and the climatic differences that predominate in our study area [66].However, it is difficult to compare the several works that can be found on remote sensing classification experiments because we consider that as well as, or even more important than, the classification method, the spectral and spatial resolution employment to take care of the exercise are also the reference categories or types of structures.For instance, one comparable example is that carried out by [67].The authors obtained OA results from 48% to 96% to classify eight types of rainforest in Africa, using some sensors with very different spatial and spectral resolutions.In this sense, our study also reflects two critical items that influence the final accuracy of classifications.The first one regards the number of classes that differ in the experiment, and the second one is related to how different those classes are.This last issue seems relevant at the time of the decision on whether to group different typologies or not.This last fact is visible in our study, in that we obtained better OA and kappa results in Level 6 than in Levels 4 and 5.
With regards to the effects of using multiangular information, the results were similar to those of the few studies which have also used multiangular information provided by remote sensing technologies.As we expected, adding multiangular data improved the classification accuracy.For instance, Brown de Colstoun et al. [68] found that the use of multiangular information from the POLDER sensor improved the OA by nearly 5% compared to only using multispectral information.In that study, the authors found that in some classes, the increase in the accuracy of the results was even higher in some specific classes, such as wooded savannas, pastures, and urban/built-up areas.
Our results showed a relevant improvement for wooded savannas (agro-forestry in Spain), but not for urban areas or grasslands.In our study area, grasslands are not green during summer, except alpine pastures, such that they can be very similar to agricultural areas.Mahtab et al. [45] also obtained better results when they included multiangular data from the MISR sensor in addition to adding only multispectral information.They found that some classes that did not differ through the nadir MISR cameras information did vary due to multiangular MISR information.Su et al. [46] also used multiangular remote sensing information to classify land covers.They found increased values of OA rounding to 10 and 15 percent when they used multiangular information in addition to the multispectral isolated information.
However, we have observed that when the level of disaggregation of the classes is low, the multiangular information seems to produce noise in the classification results, and that it is better to use only multispectral data than only multiangular data for classification.Moreover, we found that when the level of disaggregation is high in the classification process, it is better to use only multiangular information than to use only multispectral information (see Figures 6 and 7).This result matches the idea that anisotropy can be a source of confusion at the classification time for classes with different structural types in the same class; i.e., this anisotropy may, for instance, change the spectral signature of a forest due to the density, height, amount, and type of leaves [39,69,70].Land cover anisotropy disperses the values of a specific class and allows structural information to be obtained [71] related to the type of cover that it must be differencing or allows the differentiation between homogeneous structures, such as olive fields or agro-forestry areas, as occurs in the present study.
The most remarkable differences in permanently irrigated land cover and water bodies that we could appreciate between two dates were caused by the fact that in the year 2008 (orbit 45320), there were irrigation restrictions and the water level in the reservoirs was very low in our study area.The date of the remote sensing information to classify may be an essential determinant of the ultimate precision of the classification experiments, as the multiangular information and the phenological time in the ecosystems change by season and between years [47].The atmospheric correction made in the present study did not suggest any improvement in results.

Conclusions
This work investigated the benefit of using multiangular remotely sensed information for the classification of land cover across mainland Spain.The most important findings of this study are as follows:

•
Adding multiangular data to spectral data improves the accuracy of land cover differentiation.The improvement increases as the levels of disaggregation increase.

•
More external cameras add multiangular information that can almost double the accuracy improvement obtained without them.

•
Having a very large number of clusters in unsupervised algorithms improves classifications even more than multiangular data.This effect, in addition to an automatic optimization method to label classes, can give an opportunity to achieve good results for unsupervised algorithms.

•
The global accuracy results obtained in the present study were not good enough to accept classifications, with medium kappa values.In some cases, these results reached 90% of OA; however, kappa values are lower than 0.8.It is difficult to obtain good results with only one image, for a large and diverse area.

Figure 1 .
Figure 1.MISR orbits used in this study after cloud masking.(a) Represents orbit 29476, belonging to path 201 and captured on 3 July 2005; (b) Represents orbit 45320, belonging to path 201 and captured on 25 July 2008.

Figure 1 .
Figure 1.MISR orbits used in this study after cloud masking.(a) Represents orbit 29476, belonging to path 201 and captured on 3 July 2005; (b) Represents orbit 45320, belonging to path 201 and captured on 25 July 2008.

Figure 2 .
Figure 2. Example of maps obtained by using orbit 29476 corrected atmospherically, for 1000 classes, 100 iterations, and 12 bands.(a) Classified map with k-means classes; (b) Labeled map after k-means class aggregation; (c) Original CORINE land cover map.

Figure 2 .
Figure 2. Example of maps obtained by using orbit 29476 corrected atmospherically, for 1000 classes, 100 iterations, and 12 bands.(a) Classified map with k-means classes; (b) Labeled map after k-means class aggregation; (c) Original CORINE land cover map.

Figure 3 .
Figure 3.The relationship between the kappa coefficient of each class for Level 7 and the number of points in the class.The horizontal axis is logarithmic.The data are means for the classifications of orbit 29476 with and without atmospheric correction.

Figure 3 .
Figure 3.The relationship between the kappa coefficient of each class for Level 7 and the number of points in the class.The horizontal axis is logarithmic.The data are means for the classifications of orbit 29476 with and without atmospheric correction.

Figure 4 .
Figure 4. Interval plots from the pairwise post hoc test after applying the global approach variance analyses.(a) OA and kappa interval plots by bands; (b) OA and kappa interval plots by level; (c) OA and kappa interval plots by orbit.The circle with a cross inside represents the median values.

Figure 4 .
Figure 4. Interval plots from the pairwise post hoc test after applying the global approach variance analyses.(a) OA and kappa interval plots by bands; (b) OA and kappa interval plots by level; (c) OA and kappa interval plots by orbit.The circle with a cross inside represents the median values.

Figure 5 .
Figure 5. Interval plots from the pairwise post hoc test after applying the global approach variance analyses.(a) OA and kappa interval plots by the number of classes for the k-means; (b) OA and kappa interval plots by the number of iterations for the k-means.The circle with a cross inside represents the median values.

Figure 6 .
Figure 6.Interval plots from the pairwise post hoc test after applying the global approach variance analyses for the iteration between the level and band factors.(a) Results using OA as the numerical variable; (b) Results using kappa as the numerical variable.

Figure 5 . 19 Figure 5 .
Figure 5. Interval plots from the pairwise post hoc test after applying the global approach variance analyses.(a) OA and kappa interval plots by the number of classes for the k-means; (b) OA and kappa interval plots by the number of iterations for the k-means.The circle with a cross inside represents the median values.

Figure 6 .
Figure 6.Interval plots from the pairwise post hoc test after applying the global approach variance analyses for the iteration between the level and band factors.(a) Results using OA as the numerical variable; (b) Results using kappa as the numerical variable.

Figure 6 .
Figure 6.Interval plots from the pairwise post hoc test after applying the global approach variance analyses for the iteration between the level and band factors.(a) Results using OA as the numerical variable; (b) Results using kappa as the numerical variable.
Remote Sens. 2018, 10, x FOR PEER REVIEW 14 of 19 did not occur.Additionally, four external bands improve the accuracy, concerning to eight-band improvements (see large differences between 8 and 12 bands)

Figure 7 .
Figure 7. Line plots of relative percentage improvement concerning only nadir data (four bands) obtained for every disaggregation level.

Figure 7 .
Figure 7. Line plots of relative percentage improvement concerning only nadir data (four bands) obtained for every disaggregation level.

Figure 8 .
Figure 8. Bar plots of the independent contribution of the explained variables as a percentage of total explained variance.(a) Results using OA as the numerical variable; (b) Results using kappa as the numerical variable.

Figure 8 .
Figure 8. Bar plots of the independent contribution of the explained variables as a percentage of total explained variance.(a) Results using OA as the numerical variable; (b) Results using kappa as the numerical variable.

Table 4
summarizes the classification results (kappa and OA).Tables2-4show how kappa values are considerably smaller than OA values and that the results are not good enough for LULC classification.

Table 4 .
Summary of the classification results by level.

Table 6 .
Results of the variance analyses performed by level (OA and k as numerical variables in every case).The asterisk (*) implies significant results with p-value < 0.05.

Table 7 .
Results from the global approach variance analyses performed (OA and k as numerical variables in every case).The asterisk (*) implies significant results with p-value < 0.05.F is the Fisher statistical value.