Comparison between Parametric and Non-Parametric Supervised Land Cover Classifications of Sentinel-2 MSI and Landsat-8 OLI Data

Abstract: The present research aims to verify whether there are significant differences between Land Use/Land Cover (LULC) classifications performed using Landsat 8 Operational Land Imager (OLI) and Sentinel-2 Multispectral Instrument (MSI) data, abbreviated as L8 and S2. To compare the accuracy of these classifications, L8 and S2 scenes covering the study area, located in the Basilicata region (Italy), and acquired a few days apart in August 2017 were considered. Both images were geometrically and atmospherically corrected and then resampled at 30 m. To identify the ground truth for training and validation, a LULC map and a forest map produced by the Basilicata region were used as references. Each point was then verified through photo-interpretation, using the AGEA 2017 orthophoto (spatial resolution of 20 cm) as a ground truth image, and, only in doubtful cases, through a direct GPS field survey. Maximum Likelihood Classifier (MLC) and Support Vector Machine (SVM) supervised classifications were applied to both types of images, and an error matrix was computed using the same reference points (ground truth) to evaluate the classification accuracy of the different LULC classes. The contribution of the S2 red-edge bands to improving the classifications was also verified. Overall, MLC classifications show better performance than SVM, and Landsat data provide higher accuracy than Sentinel-2.


Introduction
The study of the Earth through remote sensing technologies has become increasingly important in recent decades and, at present, most of the payloads of new satellite missions carry dedicated sensors for Earth observation (EO), enabling the scientific community to take advantage of these data sources in a wide range of application fields. Remote sensing images from satellite sensors provide a global view of the Earth, making it possible to observe and study even very remote and inaccessible areas of the globe, where classical investigations would be extremely onerous if not impossible. Currently, the satellite missions for EO are numerous and diversified; taken together, they offer remarkably short revisit times over the same area, allowing the continuous monitoring of natural and anthropic phenomena [1,2].
Scientific research has considerably expanded the areas of application of remote sensing thanks to the different spectral, geometric, and temporal resolution characteristics of airborne and satellite sensors. The sectors range from morphological ones (realization of digital elevation models-DEM, digital surface models-DSM, topography, etc.) to geological ones (recognition of rock types, soil studies, monitoring of degradation phenomena such as landslides, erosion, desertification, etc.) [3][4][5], from the study of the quality of lakes [6,7] and the dynamism of inland waters [8] to oceanography studies [9]. Applications also concern land consumption (urban sprawl, urban structure and density, detection of Geographies 2023, 3 83 unauthorized constructions, road infrastructures) [10,11], the monitoring and management of natural and anthropogenic disasters (earthquakes, hurricanes, landslides, wildfires, pollution, etc.), biochemical processes [12], and the monitoring of the conservation levels of monumental and archaeological sites [13].
Agriculture and forestry represent historical remote sensing fields, which are still expanding [14,15]. The first concerns various sectors ranging from crop classification and phenological cycles to the presence of water stress, infestations, etc. [16,17]. In the forestry sector, the fields of application are very broad and range from forest species classification to vegetation dynamics, forest recolonization processes, deforestation [18][19][20], the identification of tree biomass, and other forest variables [21,22]. Further areas of investigation in forestry are net primary production [23][24][25], forest plantations [26], fire monitoring and detection [27] and the characteristics of wildfires [28,29], and anthropogenic stress on the environment [30].
One of the application fields of remote sensing, historically the most dated and widely used, concerns the realization of LULC maps. Recently, LULC maps have become increasingly important in various sectors, including sustainable land management, landscape ecology, research on climate change, etc. [31][32][33]. Land Use and Land Cover Change (LULCC) maps help us to better understand the environmental processes related to the water cycle, biochemical cycles, and energy exchange. Moreover, they represent a piece of important information to better understand the biodiversity alterations and the presence of environmental stresses possibly related to natural or anthropic factors. Land cover has also been recognized as an Essential Climate Variable (ECV) and proposed as a Satellite Remote Sensing Essential Biodiversity Variable [34]. Land cover characterization with high thematic and spatial levels (spatial resolution less than or equal to 30 m) allows the Earth's surface monitoring on a comparable scale to that of human activity [35].
In recent years, there has been an explosion in the availability of remote sensing data at medium-high resolution mounted on different platforms, and, at the same time, the methods and techniques for the realization of land cover maps have been refined and expanded. The realization of LULC maps through remote sensing involves a series of phases, including the selection of the remotely sensed data (mission, sensor, spectral bands, etc.), the identification of ground truth for the application of the classification algorithm, and ground surveys for the validation of the method and the possible selection of the algorithm that has the best performance. In the literature, there are many studies on LULC classification techniques through the use of supervised and unsupervised algorithms, per-pixel or object-based [36,37].
A meta-analysis of pixel-based supervised techniques for land cover classification performed by Khatami et al. [38] reveals that the inclusion of ancillary data, textures, and multitemporal images allows for a significant improvement in the accuracy of the classification. However, ancillary data are not always available and the processing of multitemporal data requires greater accuracy in data processing and a much higher investment in terms of time and human resources. Furthermore, as shown in the work of Khatami et al. [38], the most efficient contribution to the improvement of the goodness of the classification is provided by the texture. In recent years, also as a consequence of the effects of climate change, land use has tended to change very rapidly [39], so it is often necessary, for the correct management of natural resources, to provide updates with very close temporal steps. Among other factors, the traditional techniques that involve continuous field surveys turn out to be very onerous [40]. Moreover, the maps of LULC and LULCC made on a global scale or for very large geographical areas (continental and even national), using the data [41,42] of satellites for EO, turn out to be of insufficient precision (considering the very wide geographic scope) and accuracy (as a consequence of the resolution of the minimum mapped unit). The resolution of the pixel appears to be excessively broad in these areas and for these sensors, so the percentage of mixed pixels, i.e., areas that include more land use types, is decidedly considerable [43]. Ultimately, these products are too coarse for analysis on a local and national scale and their reliability has often been questioned [44].
Technologies based on remote sensing, however, represent the most efficient and least expensive way, considering the free availability of images, to monitor the changes that occur on Earth.
The current open availability of medium-high-resolution satellite images allows for obtaining reliable land use maps at the local or regional level and in application fields ranging from urban planning to the agricultural sector and the management of natural resources [45]. Already since 2008, the US Geological Survey (USGS) policy on open data [46] has led to a drastic increase in the use of Landsat images in the scientific community, with an ever-increasing demand for Landsat mission data over the years [39]. The launch in 2013 of the Landsat 8 satellite with OLI and TIRS sensors on board, with better spectral characteristics than those of the previous Landsat missions, has contributed to further increasing the use of these data for obtaining LULC both on a local scale and in much broader areas [47]. On 27 September 2021, the new Landsat 9 satellite was also launched into orbit, with the same optical characteristics as Landsat 8 OLI, which will therefore ensure an increase in the repetitiveness in the same area. The open data policy has also had repercussions for other space agencies. The European Space Agency (ESA) has launched satellites for EO, Sentinel-1, -2, and -3, that provide free satellite images for the scientific community, government agencies, the private sector, etc. [48][49][50]. ESA data are acquired in the microwave (Sentinel-1) and optical (Sentinel-2 and -3) electromagnetic spectrum [49]. The Sentinel-2A satellite was launched in June 2015, while Sentinel-2B was launched on 7 March 2017. Thanks to these twin satellites, a high revisit is ensured (approx. 2-3 days). The images have a swath width of 290 km and provide multispectral images in 13 bands at different spatial resolutions. They include 10 m resolution bands (four bands, three in the visible and one in the near-infrared) and 20 m resolution bands (red-edge and two bands in the short-wave infrared), as well as 60 m bands for atmospheric correction [48].
Studies comparing Sentinel-2 (S2) with Landsat 8 (L8) found that the improved spatial and spectral capabilities of S2 helped in identifying different grassland management techniques [51], in estimating tree cover and the leaf area index (LAI) [52], and in improving the classification quality of built-up areas [11]. These comparative studies attributed the better performance of S2 to the inclusion of its three red-edge bands. These bands lie between the red and NIR bands of the electromagnetic spectrum, where the reflectance of vegetation rises sharply [53]. Several studies have highlighted the importance of these bands in LAI estimation [53], in the identification of chlorophyll and nitrogen content in plants [54], and in the discrimination of different crop types, thanks to their sensitivity to the structural diversity of leaves and foliage [55]. Vegetation indices based on the red-edge bands have also been developed and have proven to be efficient for the early diagnosis of stress symptoms in forest stands, allowing timely intervention and the protection of forest resources [56]. The inclusion of the red-edge bands in a classification scheme [57] has positively influenced the capacity to discriminate the various land use classes, improving the accuracy of the LULC classification. Earlier work [58], comparing the accuracy of classifications based on Compact Airborne Spectrographic Imager (CASI) data, with sixteen spectral bands, and Landsat TM data, had already shown that including the red-edge band in the classification algorithm leads to an improvement in accuracy.
Numerous classification techniques have been developed over the years; among those that use the pixel as the basic unit of analysis, there are unsupervised techniques (e.g., K-means and ISODATA), supervised techniques (e.g., Maximum Likelihood, Minimum Distance, Artificial Neural Network, Decision Tree, Support Vector Machine, Random Forests) [40,59], and genetic algorithms [60]. Among the non-parametric supervised techniques, machine learning methods are widely used, particularly Random Forests and Support Vector Machine (SVM) [61]. Hybrid classification techniques have also been proposed [62], such as semi-supervised learning and the fusion of supervised and unsupervised learning.
The present study aimed to compare, using two of the most reliable classification algorithms, one parametric (Maximum Likelihood Classifier-MLC) and the other non-parametric (Support Vector Machine-SVM), the accuracy of LULC obtained from the data of the two Landsat 8 OLI and Sentinel-2 satellites. For the two sensors, the bands positioned in the same range of the electromagnetic spectrum were used, and, finally, it was verified whether and to what extent the implementation of the S2 red-edge bands leads to an improvement in the accuracy of the classification.

Study Area
The area in question falls within the central-northern portion of the Basilicata region (Italy), also including the regional capital, and covers an area of ~36,000 ha (Figure 1). It is a large area that is representative, in both morphological and vegetational terms, of the hilly and mountainous landscape of Basilicata. From the orographic point of view, the area has considerable altimetric variability, with a minimum altitude of ~546 m a.s.l. and a maximum of just over 1700 m a.s.l., and therefore a difference in height of over 1000 m. The most widespread altimetric range is the mountainous and high-mountainous one (over 75% of the area in question), with ~40% of the area in the range between 800 and 1000 m and ~35% over 1000 m. This is the western and southwestern part of the study site, corresponding to the ridge of the Lucanian Apennines. Approximately a quarter of the entire surface is instead located in a hilly area, especially in the 600-800 m range. This altimetric variability is matched by a high variability of the slopes, with approx. 20% of the area having slopes greater than 45°. The slope values are, however, generally high, and the flat or sub-flat surfaces (i < 5°) cover approx. 5% of the entire study area. The morphological, and therefore climatic, variability is reflected in the variability of land uses.
The area is mainly occupied by forest types, which cover over 65% of the entire study site. There are mainly broad-leaved forests (~50% of the entire area) and areas consisting of thermophilous shrubs (~12%), in various successional stages and, therefore, with various levels of density. Coniferous forests occupy an area of ~1000 ha, attributable to reforestation mainly with Pinus nigra or Pinus halepensis. Broad-leaved forests are mostly made up of mesophilous oak forests (with a prevalence of Turkey oak) and, at higher altitudes, of beech trees. Less frequently, thermophilous oak woods dominated by downy oak (Quercus pubescens) are found in the vegetation belt below the Turkey oak (Quercus cerris) forests. Finally, along the riverbeds, hygrophilous broad-leaved trees are present, mainly poplars and willows. The pastures occupy an area of ~2800 ha, equal to almost 8% of the entire surface of the study area. Crops mainly consist of non-irrigated arable land, especially wheat. Permanent crops, on the other hand, cover a small area (~1.3% of the entire area) and are attributable to olive groves and, to a much lesser extent, to vineyards and orchards.

Satellite Data
To compare the classifications, the Landsat 8 OLI and Sentinel-2 images described below were acquired. The L8 image was downloaded from the Earth Explorer portal of the United States Geological Survey (USGS), https://earthexplorer.usgs.gov/ (accessed on 10 November 2022), while the S2 image was acquired, again free of charge, from the ESA Sentinel data hub, https://scihub.copernicus.eu/dhus/#/home (accessed on 10 November 2022). The L8 scene falls into path 188 and row 32, while the Sentinel-2A image falls into tile 33TWE (Figure 2). The Landsat OLI image was acquired on 18 August 2017, while the Sentinel-2 one was obtained on 26 August 2017. The images are therefore temporally very close, although, for classification purposes, perfect synchrony is not so important that its absence would invalidate a comparison [63].

The images described above were suitably pre-processed: in particular, cloud cover removal and atmospheric correction were carried out on the acquired scenes. To remove the clouds from the Sentinel-2 scene, the IdePix algorithm (version 3.0) was used, which is contained within ESA's Sentinel Application Platform (SNAP), the application for managing the images of the Copernicus mission. For the Landsat 8 OLI images, the Fmask procedure was used [64], which is particularly suitable for processing the cloud cover of Landsat images [65]. The atmospheric correction was subsequently carried out. The Sentinel-2 images were corrected with the Sen2Cor algorithm, available within the SNAP application, which converts Top-of-Atmosphere (TOA) images into Bottom-of-Atmosphere (BOA) images. The algorithm is based on a radiative transfer model [66] and also relies on Lambert's reflectance law. Essentially, Sen2Cor uses an adapted version of Atmospheric and Topographic Correction (ATCOR), with look-up tables generated through the libRadtran software package for radiative transfer calculations.
The Landsat OLI images, on the other hand, were corrected by applying the 6SV algorithm [67], which has proven to be one of the most efficient algorithms for the correction of the spectral bands of different sensors [68]. The next operation consisted of the co-registration of the Landsat 8 OLI and Sentinel-2 images. Numerous studies have highlighted a misregistration between the two satellites due to residual geolocation errors in the Global Land Survey images [69] on which the georeferencing of Landsat images is based. To remedy this issue, the OLI images were co-registered with the Sentinel ones, through the selection of ground control points (GCP) using the orthophoto of the Basilicata region produced by the Agricultural Supply Agency (AGEA) in 2017, with a spatial resolution of 0.20 m. Subsequently, a further selection of GCPs was made on the orthophoto of the study area to refine the co-registration of the images. Root mean square (RMS) errors in the study area range from 0.09 to 0.12 pixels. Co-registration is a fundamental operation because only accurate spatial correspondence between images allows correct multi-sensor comparisons [70]. Finally, to make the images of the two satellites comparable, the 10 m bands of S2 (visible and NIR) and those at 20 m (red-edge bands and SWIR bands 11 and 12) were resampled at 30 m, the resolution of the OLI bands.
The S2 bands at 10 m were resampled with the boxcar method, while the S2 bands at 20 m were resampled using an area-weighted average method [71]. For the comparison between the two sensors, the bands that had very similar characteristics in terms of bandwidth and center band were selected.
The bands shown in Table 1 were selected for comparison, with the same classification algorithm, between sensors L8 and S-2. The selected bands are the most similar, between the two sensors, in terms of positioning in the electromagnetic spectrum, evaluated in terms of bandwidth and center-band value. To verify the contribution in improving the accuracy levels of the tested classifiers, the red-edge bands were also implemented for Sentinel-2 to evaluate, with the same classification algorithm, the contribution of these bands (bands 5, 6, and 7), located between the red and the NIR, not present in the Landsat OLI images.
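The boxcar resampling of the 10 m S2 bands to 30 m mentioned above can be illustrated with a minimal sketch, assuming that "boxcar" here means an unweighted mean over non-overlapping 3 × 3 pixel windows (three 10 m pixels per 30 m pixel side). The function name `block_mean` and the toy grid are hypothetical and are not taken from the software used in the study; the area-weighted resampling of the 20 m bands is not covered.

```python
def block_mean(band, factor):
    """Aggregate a 2D grid by averaging non-overlapping factor x factor windows.

    band   : list of rows (lists of reflectance values) at the fine resolution
    factor : integer ratio between coarse and fine pixel size (3 for 10 m -> 30 m)
    """
    rows, cols = len(band), len(band[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of the aggregation factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            # Collect every fine pixel that falls inside the coarse pixel.
            window = [band[r + i][c + j]
                      for i in range(factor) for j in range(factor)]
            out_row.append(sum(window) / len(window))
        out.append(out_row)
    return out


# Toy 3 x 6 grid at "10 m": it collapses to a 1 x 2 grid at "30 m".
fine = [[1, 2, 3, 10, 10, 10],
        [4, 5, 6, 10, 10, 10],
        [7, 8, 9, 10, 10, 10]]
coarse = block_mean(fine, 3)
```

Each output value is the plain mean of nine input pixels, so the sketch only applies when the two grids are aligned and the resolution ratio is an integer.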

Legend Definition and Selection of Ground Truths
The tested classifiers are of the supervised type and therefore require ground truth data. To identify the land use categories, and then to fine-tune the classes to be considered in testing the various classifiers and the performance of the two sensors, two maps developed for the Basilicata region were used: a land use map that adopts the Corine Land Cover legend and the regional forest map. Although the first also considers the forest types, the second map, due to the method with which it was made, presents levels of precision and accuracy [72] for the forest uses that are decidedly higher than those of the first. The careful analysis of the two maps, therefore, led to the identification of the legend shown in Table 2. The burned areas were not present in the maps used, but a visual analysis of the satellite images in various band compositions (especially the near-infrared composition) highlighted the presence of burnt areas. Overlaying them with the Anti Incendio Boschivo (AIB) layer of the Basilicata region showed that they matched the burned areas recorded in that layer. Figure 3 confirms the presence of two wildfires that occurred on 9 August 2017 and 15 August 2017, respectively.
For each of the land use classes, several ground truths were selected and used both for training the classification algorithm (training sites) and for validating the accuracy of the maps obtained (validation points). In particular, 308 training sites were selected, distributed approximately in proportion to the extent of the various LULC classes.
For each of these training sites, whose LULC attribute was extracted from the land use map, the correctness of the attribution was first verified, for the forest types, using the forest map of the Basilicata region. In extracting the points from the polygons of the land use map, care was taken to select them at a distance of not less than 45 m from the boundary between one land use class and another, so that the corresponding satellite image pixels belonged, as far as possible, to a pure class. Furthermore, the condition of a minimum distance of 45 m between points meant that no more than one ground truth could fall within each pixel. Subsequently, each of these points was verified by photointerpretation, overlaying the vector file of the training sites on the 2017 AGEA orthophoto of the Basilicata region (with a resolution of 20 cm). The procedure thus allowed for the precise attribution of the training sites to the various land uses. For only 19 of these points was there ambiguity in the attribution to the LULC class; to identify the exact land use class of these points, verification in the field via GPS was necessary. The same procedure was used for the ground truth points used in the validation phase. In this case, 165 control points were randomly selected and subjected to the same verification as the training sites. For 12 points, there were difficulties in assigning the land use class, and therefore a field survey was carried out using GPS. It is a well-established practice to choose training sites from pre-existing land use maps [73], but these must be of high quality [74], and it is important, in any case, to verify the congruence of the training sites with the real land use, although the procedure can be expensive.
The sampling approaches, protocols, and designs for the collection of calibration and validation data have gradually matured over time [2,40,75], with particular emphasis on the type of application that these data are intended for, such as time series analysis [76] or studies on classification techniques [77]. However, the focal point remains the choice of a good number of training sites of excellent quality and a good sampling design, as unbalanced samples can lead to substantial errors [78], especially with spatial data of high geometric resolution. It has even been shown [61], by testing various types of classifiers, that the size and quality of the sample of training sites have a greater impact on the accuracy than the classification algorithms used. Moreover, the quality of the training sites themselves is more important in SVM than their number [79]. An old rule of thumb applies to ML classification according to which the number of training sites must be at least ten times the number of classes. Instead, for SVM classifications and machine learning in general, numerous studies have established that SVM requires, to obtain good results in terms of accuracy, a lower sample size [61,80], although this algorithm also requires good training quality in terms of correct attribution to the various classes [81]. Therefore, as previously mentioned, particular attention was paid to the selection of the ground truth and the correct identification of land use, which proved to be a crucial phase [82] for obtaining good-accuracy LULC maps. The solution adopted to avoid excessive reductions in the classification accuracy due to unbalanced training data, which in any case also negatively affects SVM [83], consisted of the choice of equalized stratified random sampling [63].
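The idea of equalized stratified random sampling can be sketched as follows, assuming that "equalized" means drawing the same number of points at random from each LULC class (stratum). The function name, the class labels, and the point identifiers are hypothetical placeholders and do not reproduce the actual sampling carried out in the study.

```python
import random


def equalized_stratified_sample(points_by_class, n_per_class, seed=0):
    """Draw the same number of random points from every class (stratum).

    points_by_class : dict mapping a class label to a list of candidate points
    n_per_class     : number of points to draw from each class
    seed            : fixed seed so that the draw is reproducible
    """
    rng = random.Random(seed)
    sample = {}
    for label, points in points_by_class.items():
        if len(points) < n_per_class:
            raise ValueError(f"class {label!r} has fewer than {n_per_class} points")
        # Sampling without replacement inside each stratum.
        sample[label] = rng.sample(points, n_per_class)
    return sample


# Hypothetical candidate points, identified here only by an integer ID.
candidates = {
    "Broad-leaved forests": list(range(0, 500)),
    "Arable land": list(range(500, 650)),
    "Permanent crops": list(range(650, 670)),
}
training = equalized_stratified_sample(candidates, n_per_class=15)
```

With this design, a rare class such as permanent crops contributes as many training points as a widespread one, which is what limits the effect of class imbalance on the classifier.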

The Classification Algorithms
Two of the most widely used pixel-based classification algorithms [59], which generally provide better results in terms of accuracy, were tested: Maximum Likelihood Classifier (MLC) [14] and Support Vector Machine (SVM), a machine learning technique [59]. In addition to the careful choice of training sites, the appropriate choice of classification methods is also decisive for obtaining good-quality land use maps [84]. No post-processing procedures (e.g., filtering) were applied to the classifications obtained, so that such procedures would not affect the correctness of the comparison between classifiers [85]. The classifications shown in Table 3 were implemented and compared. The "S2-MLC-with-RE" and "S2-SVM-with-RE" classifications included the red-edge bands, to verify their contribution to improving the accuracy of the classification compared to "S2-MLC-without-RE" and "S2-SVM-without-RE". Furthermore, the comparisons "OLI-MLC" vs. "OLI-SVM", "S2-MLC-without-RE" vs. "S2-SVM-without-RE", and "S2-MLC-with-RE" vs. "S2-SVM-with-RE" provided information on the best-performing classification algorithm.
MLC represents one of the most used methods among the parametric classification approaches. In this type of approach, it is assumed that the data are distributed according to a predefined probability model and the parameters of this distribution depend on the input data of the training sites. In MLC, in particular, the unknown pixels are assigned to the specific class using the probability of the contours around the training area using the maximum likelihood approach. MLC requires a sufficient number of ground truths to be able to estimate the mean vector and the population variance/covariance matrix. The pixels are assigned to each class based on the threshold value provided by the user. If the probability value of the class is lower than the threshold value set by the user, the pixels are not classified [86]. In the case study, all the pixels were forced to belong to some class in such a way as not to have unclassified pixels. MLC is considered, among the supervised classifiers of the parametric type, the one that provides the best results [84]. Machine learning algorithms, typically non-parametric, do not make assumptions regarding the distribution of input data and are flexible and robust concerning the non-linear and noisy relationships between input characteristics and land use classes. Ultimately, these algorithms do not make assumptions about the normality of distribution and find more and more space in the remote sensing literature [59,87].
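The maximum likelihood decision rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used in the study: it is restricted to two bands so that the 2 × 2 covariance matrix can be inverted in closed form, it assumes Gaussian class distributions with equal prior probabilities, and it applies no probability threshold (every pixel is assigned to a class, as in the case study). The class names and reflectance values are invented.

```python
import math


def fit_gaussian(samples):
    """Estimate the mean vector and 2x2 covariance of (band1, band2) training pixels."""
    n = len(samples)
    m1 = sum(s[0] for s in samples) / n
    m2 = sum(s[1] for s in samples) / n
    s11 = sum((s[0] - m1) ** 2 for s in samples) / (n - 1)
    s22 = sum((s[1] - m2) ** 2 for s in samples) / (n - 1)
    s12 = sum((s[0] - m1) * (s[1] - m2) for s in samples) / (n - 1)
    return (m1, m2), (s11, s12, s22)


def log_likelihood(pixel, mean, cov):
    """Gaussian log-likelihood of a pixel, dropping the constant shared by all classes."""
    s11, s12, s22 = cov
    det = s11 * s22 - s12 * s12
    d1, d2 = pixel[0] - mean[0], pixel[1] - mean[1]
    # Mahalanobis distance using the closed-form inverse of a 2x2 matrix.
    maha = (s22 * d1 * d1 - 2 * s12 * d1 * d2 + s11 * d2 * d2) / det
    return -0.5 * (math.log(det) + maha)


def classify(pixel, models):
    """Assign the pixel to the class with the highest log-likelihood."""
    return max(models, key=lambda label: log_likelihood(pixel, *models[label]))


# Invented (red, NIR) reflectances for two hypothetical classes.
training = {
    "forest": [(0.03, 0.40), (0.04, 0.45), (0.05, 0.42), (0.04, 0.38)],
    "bare soil": [(0.20, 0.25), (0.22, 0.30), (0.25, 0.28), (0.21, 0.24)],
}
models = {label: fit_gaussian(pixels) for label, pixels in training.items()}
label = classify((0.04, 0.41), models)
```

The need to estimate a full mean vector and covariance matrix per class is the reason MLC requires a sufficient number of training sites: with too few samples the covariance matrix becomes singular and the likelihood cannot be evaluated.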
SVM is a widely used algorithm as it has high potential for complex and high-dimensional data, i.e., data with many predictive variables [61,79,88], and for unbalanced datasets [89,90]. The advantage of SVM is that it provides good accuracy even with a rather small number of training sites [80,91]. The SVM algorithm [92] is a supervised machine learning classifier, trained to identify the optimal separating hyperplane by maximizing the margins between classes [90]. The identified hyperplanes not only maximize the distance between the classes but are such that they do not include any points between them [61]. The points closest to the hyperplane are called support vectors [93]. For the identification of nonlinearly separable classes, four kernel functions are often used in SVM: linear, polynomial, radial basis function (RBF), and sigmoid. RBF was chosen for this work due to its superiority as demonstrated in several studies (e.g., [14,61]). This function has two user-defined parameters that can affect the accuracy of the classification [94]: cost (C), a value that controls the tolerance to classification errors in the training site dataset, and gamma (g), which influences the shape of the separating hyperplane:
$$K(x_i, x_j) = \exp\left(-g \, \lVert x_i - x_j \rVert^2\right)$$
where K is the RBF kernel, x_i and x_j are two samples (feature vectors in some input space), and g is a free parameter. For a fairer comparison between images from different sources, the default values of the parameters were used in this work; in particular, for g, the default setting is the inverse of the number of calculated attributes, while the default value of C is equal to 100. Moreover, according to Melgani and Bruzzone [95] and Maxwell et al. [63,96], parameter optimization procedures may not lead to significantly different results in the accuracy of the classifications.
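As a minimal sketch of the RBF kernel and of the default gamma setting described above, the following functions evaluate K for two feature vectors and set g to the inverse of the number of attributes (bands). The function names and the reflectance values are hypothetical, and the assumption that six bands are used is only for illustration.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel: exp(-gamma * squared Euclidean distance between two samples)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


def default_gamma(n_attributes):
    """Default gamma as described in the text: the inverse of the number of attributes."""
    return 1.0 / n_attributes


# Two hypothetical pixels described by six band reflectances each.
pixel_a = (0.04, 0.06, 0.05, 0.40, 0.20, 0.10)
pixel_b = (0.10, 0.12, 0.15, 0.30, 0.28, 0.20)
g = default_gamma(len(pixel_a))
similarity = rbf_kernel(pixel_a, pixel_b, g)
```

The kernel equals 1 for identical samples and decays towards 0 as the samples move apart; a larger g makes the decay faster and therefore the decision boundary more local.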
For the accuracy assessment of the classifications, one of the most commonly used approaches for classifications of satellite data was adopted: the error matrix [75]. The evaluation of accuracy is meaningful only when it is designed in a transparent and statistically defensible way. The error matrix yields the Overall Accuracy (OA), User's Accuracy (UA), Producer's Accuracy (PA), Omission Error (OE), Commission Error (CE), and the Kappa coefficient [75,80,97,98,99].
An OE occurs when a pixel belonging to the class in question is assigned to another class, while a CE arises when a pixel of another class is assigned to the class in question. The PA of each class, which is the complement of the OE, is calculated as the ratio between the pixels on the main diagonal of the error matrix and the total number of ground truth pixels belonging to that particular class. The UA, which is the complement of the CE, is instead defined as the ratio between the pixels on the diagonal and the sum of the row elements corresponding to a certain class, i.e., the total number of pixels classified as that class. The OA of a confusion matrix is the ratio between the number of correctly classified pixels and the number of all ground truth pixels N_t:
$$OA = \frac{\sum_{i=1}^{n} m_{i,i}}{N_t}$$
The Kappa coefficient is a measure of the overall agreement of an error matrix and is recognized as a powerful method of measuring errors and comparing the differences between various error matrices [99]. Its formulation is as follows:
$$\kappa = \frac{N \sum_{i=1}^{n} m_{i,i} - \sum_{i=1}^{n} G_i C_i}{N^2 - \sum_{i=1}^{n} G_i C_i}$$
where i is the class number, n is the total number of classes, N is the total number of classified values compared to truth values, m_{i,i} is the number of values belonging to the truth class i that have also been classified as class i, G_i is the total number of truth values belonging to class i, and C_i is the total number of predicted values belonging to class i.
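The accuracy measures defined above can be computed from an error matrix as in the following sketch. It assumes that rows hold the classified (map) classes and columns hold the ground truth classes, and that every class has at least one classified and one ground truth pixel; the function name and the 2 × 2 example matrix are hypothetical.

```python
def accuracy_metrics(matrix):
    """Compute OA, UA, PA and Kappa from a square error matrix.

    matrix[i][j] = number of pixels classified as class i whose ground truth is class j
    (rows = classified classes, columns = ground truth classes).
    """
    n = len(matrix)
    total = sum(sum(row) for row in matrix)                      # N
    diagonal = [matrix[i][i] for i in range(n)]                  # m_ii
    classified = [sum(matrix[i]) for i in range(n)]              # C_i (row sums)
    truth = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # G_i (column sums)

    overall = sum(diagonal) / total
    users = [diagonal[i] / classified[i] for i in range(n)]      # UA = 1 - CE
    producers = [diagonal[i] / truth[i] for i in range(n)]       # PA = 1 - OE
    chance = sum(g * c for g, c in zip(truth, classified))
    kappa = (total * sum(diagonal) - chance) / (total ** 2 - chance)
    return {"OA": overall, "UA": users, "PA": producers, "Kappa": kappa}


# Hypothetical two-class error matrix with 100 validation pixels.
example = [[45, 5],
           [10, 40]]
metrics = accuracy_metrics(example)
```

In this example, 85 of the 100 pixels lie on the diagonal, so OA is 0.85, while Kappa is lower (0.70) because part of that agreement would be expected by chance.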

Results
Preliminarily, as suggested by various authors [100], an analysis of the separability of training sites was conducted using the Jeffries-Matusita distance test (J-M). The value of J-M theoretically ranges from 0 to 2.0, and values above 1.8 indicate statistically good separability [101]. The data analysis showed good separability between the different land use categories for both Landsat 8 OLI and Sentinel-2 images, with values higher than 1.8. Only for some categories, the separability was not statistically significant (Table 4).
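To make the J-M range of 0 to 2.0 concrete, the sketch below computes the distance for a single band, assuming that each class is normally distributed and using the common definition JM = 2(1 - exp(-B)), where B is the Bhattacharyya distance. The single-band simplification, the function names, and the class statistics are assumptions for illustration; the separability analysis reported here was run on the full set of bands, which requires the multivariate form with covariance matrices.

```python
import math


def bhattacharyya_1d(mean_1, var_1, mean_2, var_2):
    """Bhattacharyya distance between two univariate normal distributions."""
    mean_term = (mean_1 - mean_2) ** 2 / (4.0 * (var_1 + var_2))
    var_term = 0.5 * math.log((var_1 + var_2) / (2.0 * math.sqrt(var_1 * var_2)))
    return mean_term + var_term


def jeffries_matusita_1d(mean_1, var_1, mean_2, var_2):
    """J-M distance in [0, 2]; values above 1.8 are read as good separability."""
    b = bhattacharyya_1d(mean_1, var_1, mean_2, var_2)
    return 2.0 * (1.0 - math.exp(-b))


# Hypothetical NIR reflectance statistics (mean, variance) for two class pairs.
similar_pair = jeffries_matusita_1d(0.30, 0.0009, 0.32, 0.0010)
distinct_pair = jeffries_matusita_1d(0.10, 0.0004, 0.50, 0.0004)
print("spectrally similar classes:", round(similar_pair, 3))
print("spectrally distinct classes:", round(distinct_pair, 3))
```

Class pairs whose means are close relative to their variances yield values well below the 1.8 threshold, which is the situation described for the confused categories.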
For both Landsat 8 and Sentinel-2, the most confused categories are arable crops (cereal crops) and natural grasslands. Analyzing the spectral trends (Figure 4c,d), it can be seen that the response of the two land use classes, both for Landsat 8 and Sentinel-2, basically follows the same trend, and the two curves differ only very slightly. Moreover, analyzing the spectral responses of these two categories (Figure 5) for the two sensors, it is evident that there is very little difference, especially in the spectral range up to NIR; only for SWIR1 and SWIR2 is there a difference, albeit modest, between the two sensors, with Sentinel-2 reflectance values greater than those of Landsat 8 OLI. A particularly confused category, both with Landsat 8 and Sentinel-2, is "Permanent crops", which is not discriminable, especially from "Arable land" and, secondarily, "Transitional woodland-shrub" and "Natural grasslands" (Figure 4c,d). Finally, there is confusion, both for L8 and for S2, between the "Broad-leaved forests" and the "Coniferous forests" classes. Analyzing the spectral trends of the two sensors (Figure 4e,f), they are very similar in the two categories. A greater distinction between conifers and broad-leaved trees is identified in the NIR region. Such differences are similar in OLI and MSI, with higher reflectance values in broad-leaved trees than in conifers. In general, however, by comparing the Landsat 8 and Sentinel-2 values for the two forest types (Figure 5), it can be seen how the trends overlap, with a slight difference in the NIR, where Sentinel-2 shows slightly higher reflectance values than OLI. Finally, it should be noted that in S2, unlike L8, broad-leaved forests are confused not only with conifers but also with the "Transitional woodland-shrub" class (Figure 4f). The results of all the classifications described above are reported in Figures 6-8.
Analyzing the different land use classes for the different classifications, it is evident that the classifiers whose area percentages diverge least from the land use map of the Basilicata region (Table 2) are "OLI-MLC" and "OLI-SVM", while the major differences are found with "S2-MLC-without-RE" and "S2-MLC-with-RE". In the latter, the addition of the RE bands did not always lead to an improvement in the percentage differences, for example for "Broad-leaved forest". The SVM classifier applied to the S2 bands, with and without the RE bands, shows quite small differences in terms of areas. For all classifiers, the greatest percentage differences with respect to the reference map relate to the following two land use classes: "Permanent crops" and "Broad-leaved forest". The "Permanent crops" are scarce in the study area, occupying just over 1% of it, while all the classifiers report areas of around 10%. The smallest deviations are provided by the classifiers that use the S2 bands, particularly by "S2-SVM-without-RE", which differs by ~7% from the reference map. Woody crops are often confused with deciduous trees, especially with forests undergoing natural expansion at their edges. The lower density of forest cover leads to confusion with tree crops (orchards, olive groves, and vineyards). Further, the percentage areas of "Broad-leaved forest" are underestimated, as all classifiers report a percentage area of around 40%, against ~50% in the reference map. The smallest deviation, ~6%, is reported by "OLI-SVM", and the other SVM classifiers are also better than the MLC classifications in terms of percentage areas. This is probably due to the well-known lower fragmentation of SVM classifications compared to MLC and to the choice not to apply, for a fairer comparison, any filter to the MLC classifications.
Even the "Coniferous forests", although scarce in the study area, are generally well discriminated, with deviations of 1-2% in all classifiers, especially those using the S2 bands. "OLI-MLC" also shows minor area differences (~1%) compared to the reference map for the categories "Transitional woodland-shrub" and, above all, "Natural grasslands". In the latter case, the ability to discriminate, compared to the other classifiers, is significant: "OLI-MLC" overestimates by ~2%, while "OLI-SVM" underestimates by more than 5%. For this land use type, the situation is analogous with the classifiers using the S2 bands, i.e., smaller deviations for MLC than for SVM, with slight improvements when the RE bands are also used. The "Non-irrigated arable land", which, according to the reference map, occupies an area of ~19%, is generally underestimated in the MLC classifications (differences of ~5-6%), while the differences with the SVM classifiers are very limited. This result is analogous to that of deciduous forests, i.e., it reflects the greater fragmentation of the MLC classifications, which manage to capture differences in cover within areas attributable to other land uses. Finally, all the classifiers show very small deviations for "Burnt areas" (0-0.1%), "Inland waters" (0.2-0.3%), and "Bare rocks" (0-2%). Moreover, the "Artificial surfaces" are generally well distinguished, with a better discriminatory capacity for the S2 bands and the MLC classifiers. The addition of the RE bands leads to practically zero differences compared to the reference map. To test the accuracy of the classifications, the confusion matrix and its synthetic indexes (UA, PA, OA, Kappa coefficient) were used. The analysis of OA and K (Table 5 and Figure 9) has shown that the classification that provides the best results in terms of accuracy is the MLC classification applied to Landsat 8 OLI data, which presents OA = 89% and K = 0.87.
The SVM classification with OLI data has lower accuracy values than MLC, with OA = 0.79 and therefore with a significant deterioration in performance.
Classifications made with Sentinel-2 data provide lower performance than those made with Landsat 8 and, among these, the best performance is obtained with MLC as the classification algorithm. MLC with Sentinel-2, without the red-edge bands, has an OA equal to 81.6% (K = 0.79), while the homologous SVM classification has an accuracy of 72.3%, a substantial difference in performance. Compared with the corresponding Sentinel-2 classification (using the same bands), MLC with Landsat has a total accuracy approximately 8% higher. The worst results are provided by the SVM classifiers using the Sentinel-2 bands. The addition of the red-edge bands to the SVM classifier does not significantly improve the overall accuracy, recording an increase in OA of ~1%.
The MLC classifier, therefore, has the best performance with both Landsat and Sentinel-2 bands, with better performance using the Landsat bands than the homologous Sentinel-2 bands. For Sentinel-2, adding red-edge bands to the MLC classifier improves the accuracy by ~5%, testifying to the importance of these bands in the discrimination of certain vegetation types. The total accuracy of the "S2-MLC-with-RE" classifier is very close to that of the MLC classifier with Landsat bands, although the latter still performs better, with an OA difference of just over 2%. By analyzing the confusion matrices of the various experiments in more detail, it can be seen that in the ML classifications (both with Landsat and Sentinel-2), the classes that are not confused (Producer's Accuracy = 100%) are "Water bodies" (code 51), "Bare rocks" (code 332), and "Burnt areas", which therefore are discriminated excellently from the other land use classes. The "OLI-MLC" classifier also performs well for other land uses (especially for all forest lands and "artificial surfaces"). The greatest confusion is for "Non-irrigated arable land", with a PA of ~65%, confused above all with "Natural grasslands" (code 321) and, secondarily, with "Permanent crops" (code 22). As mentioned, the discrimination capacity of forest lands is very good, with accuracy generally higher than 95%. Only the "Broad-leaved forests" have lower accuracy (~90%) and are confused with the other forest categories, especially with the transitional woodland-shrub (code 324) and, secondarily, with conifers. The natural grasslands (code 321) also show good interpretative capacity (PA of ~97%) and are confused only with the burnt areas.
The homologous classification conducted using the Sentinel-2 bands ("S2-MLC-without-RE") has, as mentioned, a total accuracy ~8% lower than the previous one. In this case too, the land use classes with a discriminatory capacity equal to 100% are the "Water bodies" (code 51), the "Bare rocks" (code 332), and the "Burnt areas" (code 334). "Non-irrigated arable land" (code 211) has the same level of accuracy as in the homologous classification with Landsat 8 and is, together with the "Permanent crops", the most confused class. The latter, compared to the "OLI-MLC" classifier, show an error ~25% higher and are confused above all with "Transitional woodland-shrub", but also with non-irrigated arable land and bare rocks. The classifier also shows a substantial worsening for all forest categories compared to "OLI-MLC". Only the "Broad-leaved forests" have almost comparable values, with differences of ~3%, while the other categories recorded substantial worsening. The "Coniferous forests", in particular, show levels of confusion ~17% higher than with the previous classifier and are mainly confused with the "Transitional woodland-shrub" (code 324) and with the "Broad-leaved forests". The discrimination capacity of "artificial surfaces" also worsens significantly (reduction in PA of around 8%).
The inclusion of the red-edge bands in the MLC classification ("S2-MLC-with-RE") leads to an increase in OA of ~5%. In this case too, a PA equal to 100% concerns bare soils, burnt areas, and water bodies. Compared to the MLC classification without the red-edge bands, the accuracy of all land use classes has improved, for the "Permanent crops" (code 22) in particular. The latter recorded an increase in accuracy of ~17% and are here confused only with arable land and shrubs, while in "S2-MLC-without-RE", they were confused with many other land use categories. The forest lands have also undergone an improvement in accuracy. Only the "Broad-leaved forests" have worsened slightly (reduction in accuracy of around 3%), being confused with the "Transitional woodland-shrub". The discrimination capacity of "Coniferous forests", thanks to the inclusion of red edges, has increased considerably (~17%), and they are confused only with broad-leaved forests and no longer, as in "S2-MLC-without-RE", also with shrubs. The natural grasslands show the same levels of accuracy, but, in this case, they are confused only with arable land and no longer also with bare rocks. The accuracy in identifying the "Transitional woodland-shrub" class has also improved (~3%), but, above all, the classes with which it is confused have been greatly reduced. In general, the introduction of red edges in the MLC classifier improves the OA and the accuracy of each class and significantly reduces the number of classes with which a given land use is confused. The SVM classifications all show worse performance than MLC and, among these, the one with the highest overall accuracy is the one conducted with Landsat data, although it shows a decrease in OA of ~14%. Many classes in "OLI-SVM" show a decrease in accuracy compared to "OLI-MLC", in particular the natural grasslands (code 321), the bare rocks (code 332), and the transitional woodland-shrub (code 324).
Much improved, however, is the discrimination capacity of arable land (accuracy approximately 20% higher than with "OLI-MLC"), which is not confused with grasslands. Forest lands also show a better PA, albeit slightly, as they are less confused with conifers. SVM classifiers with Sentinel-2 bands have an overall accuracy of just over 70%, and the inclusion of the red-edge bands in the classifier does not much improve the OA.

Discussion
Various studies have compared the performance of non-parametric classifiers, both with each other and in comparison with other parametric classifiers, in many territorial areas and for different types of classification ranging from agricultural to forestry areas to LULC [37,40,102], without reaching an unambiguous consensus on the performance of the various classification procedures. Several studies have compared the performance, of the same classifier, between spectral information deriving from different sensors and, in recent years, more specifically between data from the Landsat mission and the Copernicus Sentinel-2 data. Again, the studies do not reach an unambiguous consensus, attributing, in some cases, better performance to Landsat and, in others, to Sentinel data.
Most likely, this depends on several factors, including the characteristics of the study area, the type and quality of remote sensing data, the choice of LULC classes, and the selection of the related training sites. In the same in-depth review by Lu et al. [103], comparing different parametric and non-parametric classifiers, they conclude that the performance of the different classifiers depends on the datasets used. Some studies, in this regard, have shown how the addition of the Sentinel-2 red-edge bands to the various classifiers improves the discrimination capacity of the different land use classes, particularly of plant physiognomies and, above all, forest species. Some authors [28,104] have identified, with different classifiers, that the performance, using Landsat 8 and Sentinel-2, is very similar when using comparable bands between the two sensors, but that adding red edges to Sentinel data provides better results than Landsat.
Analyses conducted on classifications in Central Europe [55,101] have shown how the implementation of the red edge in classifiers can improve the discrimination capacity of forest stands. Previous studies [57,90] have highlighted the importance of red edges in distinguishing crops from the forest and that the entire dataset of Sentinel-2 bands leads to the discrimination of forest species [105,106] with levels of accuracy always very high, above 80%. Some authors [107] have highlighted, using Landsat 8, the importance of SWIR bands, together with NDVI, in the distinction of crops. The inclusion of NDVI in the classification, while it might be useful using OLI, would be redundant with Sentinel-2, where the radiometric input of the red-edge region is sufficient to improve the accuracy of the classification [28].
Immitzer et al. [17] demonstrated the good ability of Sentinel-2 to separate different land use classes, as well as to classify different forest types. The most important bands were SWIR, red, and NIR. The importance of NIR and SWIR for the classification of forest species has also been highlighted in other studies [17,101,105]. Immitzer et al. [17] conclude that the NIR band is the most important for the separation between conifers and broad-leaved forests, while the SWIR bands are more important for the identification of different forest species. Some authors [105] have found that, in Sentinel-2, the narrower NIR band (band 8A) is more important for classification than the wider NIR band (band 8), and have highlighted the importance of red-edge bands in improving the classification accuracy.
Griffiths et al. [15], for the mapping of agricultural areas in Germany, using Landsat OLI and Sentinel-2 multitemporal data, obtained good results, with accuracy greater than 80%, demonstrating how the inclusion of the red-edge bands in the Random Forest (RF) classification process leads to a slight improvement in overall performance, especially for classes concerning crops. For forests, numerous studies [78,108] have identified the importance of the visible and near-infrared bands in forest classification. Further studies [101] have highlighted the importance of the red-edge bands in the classification of various forest types. These studies highlight the remarkable potential of red-edge bands in forest classification compared to past multispectral sensors without these bands [109,110]. Of the red-edge bands, band 5 (705 nm) is important to distinguish various types of herbaceous crops [111], band 6 to better discriminate natural grasslands from herbaceous crops, and finally band 7 (783 nm) to better separate forest species. Korhonen et al. [52] identified that the red-edge bands, together with the SWIR bands, are the most significant for the discrimination of forest types in the boreal area. Similar results were also obtained by Persson et al. [105]. The importance of the SWIR bands (centered at 1610 nm and 2190 nm, band 11 and band 12, respectively) could be due to their ability to detect the variability of the water content of the different forest species and, therefore, to help discriminate them better [112]. As for the types of classifiers, also in this case, as mentioned, there is no agreement on which ones, with the same input data, provide better performance. According to some authors [59], SVM, owing to its high ability to generalize complex characteristics, can provide generally better results than parametric classifiers. Goodin et al. [113], using Landsat 8 data, obtained an overall accuracy of 88%, while Mansaray et al. [114] recorded, with SVM using Landsat 8, Sentinel-1, and Sentinel-2, an accuracy of 93.4%. Khatami et al. [38] found that SVM is the most efficient machine learning algorithm in many applications, outperforming RF, neural networks, and decision trees. Abdi [82] also identified, by comparing four non-parametric algorithms, the best performance of SVM compared to other machine learning classifiers. According to Adam et al. [90] and Li et al. [115], machine learning algorithms (SVM and RF in particular) provide comparable results in terms of accuracy, while Zhang and Xie [116] and Maxwell et al. [96,117] found SVM to perform better than RF. Gong et al. [118] and Yu et al. [119], comparing different parametric and non-parametric classifiers, come to the same conclusion, namely the better accuracy of the SVM classifications.
Furthermore, this classifier provides better performance through a pixel-based approach rather than an object-based one. Other authors reach the opposite conclusion: Nivedita Priyadarshini et al. [120], testing Sentinel-2 using different unsupervised and supervised classification algorithms (including MLC and SVM), have reported that MLC shows better performance in terms of accuracy, obtaining OA = 89.3% and OA = 75.6% for MLC and SVM, respectively. Nezhad M. et al. [121] also found that MLC's performance is always higher than that of SVM in the classification of urban and periurban areas of the city of Rome using Sentinel-2 multitemporal images.
The present study always shows better results for MLC classifications than SVM, with differences, using Landsat 8 OLI images, of ~11%. Using the Sentinel-2 bands, without the red edges, the difference is ~10%. If the red edges are also included, the difference between MLC and SVM increases considerably (MLC is better than SVM by ~15%) because, while in MLC the inclusion of the red edge improves the accuracy, in SVM it even worsens it, albeit slightly. The difference in accuracy between ML and SVM could also be because SVM does not use all the data from the training sample but only a subset that defines the boundary or margin conditions (the support vectors). This would suggest [63] that, in SVM, the strategy for collecting training sites should focus not on typical pixels but on pixels that could be spectrally confused with other classes. Furthermore, the quality of training sites plays a fundamental role in ML classifications, more so than their number, which, paradoxically, could lead to the well-known Hughes effect [122], i.e., a reduction in accuracy with increasing sample dimensionality.
The training sites were chosen, in the study site, with an almost balanced criterion, to take into account the percentage distribution of the different land use classes, and selected in such a way as to select pure pixels, and this could have influenced the greater accuracy of the MLC classifications versus the corresponding SVM ones. Finally, it must be said that the choice not to optimize the parameters of the tested classifiers, without, therefore, the iterative search for the best value of the parameters, could have affected the accuracy of the classification. The choice, however, was imposed by the need for a fair comparison between classifiers without introducing a priori knowledge [82]. In general, it can be said that the most confused classes, both among the classifiers taken into consideration and considering the data of the two different sensors, are between non-irrigated arable land (code 211) and natural grasslands (code 321), as already identified in previous studies [17]. Probably, the choice of a mono-temporal image is not able to fully distinguish between the various land use classes, particularly between crops and between these and natural grasslands. A careful choice that takes into account the phenology of the prevalent herbaceous crops could allow better discrimination of these with pastures. It is also possible that the optimal choice, depending on the land use classes present in the study area, of a single image, could allow for obtaining results very similar to multitemporal images throughout the year [123]. In the case in question, the images in August coincide with the wheat harvest, which has already taken place, whose plant residues can be radiometrically confused with the natural herbaceous vegetation. In this case, the choice of an image of June (or the beginning of July), when the wheat growth is at its maximum stage, would probably be able to better discriminate the two land use classes. 
Similarly, images acquired in spring would probably better differentiate forest species [55,105].
The image acquired in August, while allowing a very satisfactory distinction between forest classes (especially with ML), presents some confusion between conifers and broad-leaved trees. Probably, a spring image, when the broad-leaved trees are not in full growth, would allow a better distinction between the two forest categories [124]. Furthermore, topographic correction would also improve the separability between conifers and deciduous trees, reducing the confusion between deciduous trees on shaded slopes and conifers. Topographic correction can represent an important phase of pre-processing, especially in mountainous regions [84,125-127]. The inclusion of red edges with Sentinel-2 in the ML classifier improves the discrimination of forest categories, particularly of conifers and shrublands [52,57,105]. Forkuor et al. [111] even go so far as to conclude that the red-edge bands are the most important for LULC, providing very similar levels of accuracy to the entire Sentinel-2 dataset. Puletti et al. [101] identify, in the red edges acquired in the summer, the best ability to distinguish between conifers, broad-leaved trees, and mixed forests. Furthermore, broad-leaved forests tend to be confused with "transitional woodland-shrub". This is believed to be related to the fact that the species referable to code 324, especially in an advanced successional stage, are always species referable to broad-leaved trees and therefore easily confused with "broad-leaved forests" (code 311). Ultimately, even greater accuracy is to be expected using multitemporal data (seasonal images in various phenological stages) [17,28,84,128].

Conclusions
In this study, ML and SVM classifiers were compared using Landsat OLI and Sentinel-2 mono-temporal images (both acquired in August 2017), testing both classifications that use comparable predictors (bands) between Landsat and Sentinel and classifications that also consider the Sentinel-2 red-edge bands among the predictive variables. Particular attention was paid to the selection of ground truth for both training and validation, this being a fundamental element, now widely recognized, for obtaining good-quality classifications, and being even more important than the classification algorithm itself. The results showed that ML classifications always provide higher levels of accuracy than SVM: with the OLI bands, we achieved an accuracy of ~89%, while the SVM classification conducted with the same bands presents an OA of ~78%. Furthermore, the comparison between the Landsat 8 OLI and Sentinel-2 datasets always favors the Landsat data: the ML classifications on the datasets that use comparable bands between the two sensors show differences, in terms of total accuracy, of ~8%. Finally, it was verified that including the red-edge bands of the Sentinel-2 dataset in the classifiers improves, with ML, the accuracy by ~5%, with performance therefore very close (difference of ~2%) to that of the ML classification conducted with the Landsat bands.
In conclusion, the study confirms that there is no univocal consensus in the literature on the type of classifier or the type of image to be used. The choice depends on the land use classes taken into consideration, their spatial distribution, the number of predictors used, and the general characteristics of the study area. It is therefore important to test, through accuracy metrics, various types of classifiers (both machine learning and parametric) and various types of satellite images to produce the optimal result for the area of interest.