Integrating Semi-Supervised Learning with an Expert System for Vegetation Cover Classification Using Sentinel-2 and RapidEye Data

In complex classification tasks, such as the classification of heterogeneous vegetation covers, the high similarity between classes can confuse the classification algorithm when assigning the correct class labels to unlabelled samples. To overcome this problem, this study aimed to develop a classification method by integrating graph-based semi-supervised learning (SSL) and an expert system (ES). The proposed method was applied to vegetation cover classification in a wetland in the Netherlands using Sentinel-2 and RapidEye imagery. Our method consisted of three main steps: object-based image analysis (OBIA), integration of SSL and an ES (SSLES), and finally, random forest classification. The generated image objects and the related features were used to construct the graph in SSL. Then, an independently developed and trained ES was used in the labelling stage of SSL to reduce the uncertainty of the process, before the final classification. Different spectral band combinations of Sentinel-2 were then considered to improve the vegetation classification. Our results show that integrating SSL and an ES can result in significantly higher classification accuracy (83.6%) compared to a supervised classifier (64.9%), SSL alone (71.8%), and ES alone (69.5%). Moreover, utilisation of all Sentinel-2 red-edge spectral band combinations yielded the highest classification accuracy (overall accuracy of 83.6% with SSLES) compared to the inclusion of other band combinations. The results of this study indicate that the utilisation of an ES in the labelling process of SSL improves the reliability of the process and provides robust performance for the classification of vegetation cover.


Introduction
Accurate mapping of vegetation cover in an ecosystem can help to initiate its protection and restoration programmes efficiently [1]. It is, therefore, necessary to acquire accurate and up-to-date information about the status of the vegetation cover of an ecosystem through regular monitoring. Traditional field inventory monitoring methods of vegetation covers are usually expensive and time-consuming [2]. Thus, a viable approach can be the use of satellite remote sensing data. Satellite data provide advantages such as large area coverage, ongoing data collection, and cost-effectiveness for monitoring and mapping purposes [3,4]. The advent of remote sensing technologies, with high-resolution multispectral satellite sensors, has provided new opportunities to monitor vegetation cover at different spatial and temporal scales. For instance, Sentinel-2 MSI, covering a wide spectral range (400-2400 nm) including three red-edge bands, may allow more efficient discrimination of different vegetation types.
A common approach to extracting information from satellite data, to distinguish different vegetation types, is using image classification algorithms [5,6]. Usually, there are three common challenges involved in image classification that could affect its accuracy: (1) collecting a sufficient number of training samples, (2) creating a balanced training and test set, and (3) fine-tuning the algorithm parameters to obtain the optimum performance [7][8][9]. A relevant solution to these challenges is the introduction of the semi-supervised learning (SSL) technique [10], which uses relatively few labelled samples and a large number of unlabelled data to train a model [11].
The existing paradigms of SSL for the classification of remote sensing data can be divided into four major categories: (1) generative mixture models, such as expectation-maximisation algorithms [12,13], (2) low-density separation algorithms, such as transductive support vector machines (TSVMs) [14,15], (3) self-learning methods [16][17][18][19], and (4) graph-based methods [20][21][22][23]. Among the SSL methods, graph-based approaches have recently received significant attention due to their ability to provide a relatively high classification accuracy while retaining computational simplicity [24][25][26][27]. While graph-based algorithms can improve the classification performance by using the distribution of unlabelled samples, they have some limitations [9]. One of these limitations occurs in complex classification tasks, such as identifying different classes in heterogeneous vegetation covers. In such a case, samples from the same vegetation class may show low similarity (i.e., high intra-class variability), and samples from two different vegetation classes may show high similarity (i.e., low inter-class variability). This "similarity" problem can confuse the graph-based algorithm, so that the semi-labelled samples may not receive the correct label. In this case, unlabelled samples can be detrimental to the graph-based algorithm, as they may degrade the accuracy by misguiding the classifier [11]. One of the common approaches to tackling the similarity problem is using non-parametric classifiers, since they do not make any underlying assumptions about the distribution of data [7]. However, these classifiers require a representative amount of training data to estimate the mapping function, and they are also subject to overfitting. Consequently, in studies such as vegetation cover classification, where there might be an imbalanced distribution of features and training samples, these classifiers would underperform [28].
To help solve this problem of SSL, this study aimed to use expert knowledge within an independently developed and trained expert system (ES) in the labelling process of a graph-based algorithm. The ES can mitigate the problem by refining the semi-labelled samples. In this context, expert knowledge is defined as the experience and existing knowledge of the expert in the specific domains of study, technical practices, and prior information on the study area [29,30]. The developed ES should be able to classify the unlabelled samples using expert knowledge, independently from SSL. This ability of the ES can help SSL to assign the most certain class label to the unlabelled samples by filtering out samples with less certain labels.
Motivated by the above insights, in this study, a novel classification approach was proposed for the classification of satellite images by integrating graph-based SSL and an ES (SSLES). The main idea of the proposed approach was to construct a graph in SSL, based on image features, and use an ES in the labelling process of SSL to assign the most probable class labels to the selected unlabelled samples and then perform the classification using a standard supervised classifier. The study specifically aimed to address two objectives, as follows: (1) to investigate the performance of SSLES for vegetation cover classification and (2) to investigate the potential of Sentinel-2 spectral data for vegetation cover classification.

Study Area
The study area was Schiermonnikoog Island in the Netherlands, located between 53°27′20″N-53°30′40″N latitude and 06°06′35″E-06°20′56″E longitude, with an area of 199.1 km² (Figure 1). The vegetation cover on the south and south-east shore of the island has adapted to the regular inundation of seawater and has formed a salt marsh [31]. In this study, vegetation species of the island were categorised into 10 functional groups for classification based on the reference vegetation map [32], namely: high matted grass, low matted grass, agriculture, forest, green beach, tussock grass, high shrub, herbs, low salix shrub, and low hippophae shrub. The natural vegetation cover has a large spatial and temporal variability, due to the dynamic influences of the tide, wind, and grazing [33,34]. This area was chosen to test the proposed classification methodology as it is representative of a diverse and mixed vegetation cover.

Sentinel-2 Data
The main satellite imagery used in this study was the standard Sentinel-2 Level-1C product, which is in UTM/WGS84 projection, and its per-pixel radiometric measurements are provided in top of atmosphere (TOA) reflectance [35]. The Sentinel-2 image of the study area was acquired on 17 July 2016 belonging to the relative orbit of R008 and was downloaded from ESA Sentinel-2 Pre-operation Hub (https://scihub.copernicus.eu/, accessed on 30 July 2016). The image was chosen from July to obtain a cloud-free image with vigorously growing vegetation. The atmospheric correction of the image was performed using Sen2Cor software [35], and the top of canopy (TOC) reflectance was calculated for further analysis.
Sentinel-2 carries a multispectral sensor with 13 bands from 443 to 2190 nm at three different geometric resolutions (10, 20, and 60 m). In this study, all the Sentinel-2 bands were resampled to 5 m resolution for processing so that they had the same resolution as the RapidEye image.
To achieve a higher classification accuracy and assess the capability of Sentinel-2 data in classifying the vegetation types, its spectral bands were combined into different groups for subsequent assessment to find the most informative band combination for vegetation classification accuracy. Based on previous studies, the most important regions of the spectrum to study vegetation cover are the red-edge, shortwave, and red-infrared regions [4][36][37][38][39][40]. Consequently, six groups of band combinations were considered to classify vegetation cover, as follows:


• Group 1: All spectral bands;
• Group 2: Red and infrared bands;
• Group 3: All shortwave infrared bands;
• Group 4: All red-edge bands;
• Group 5: Red, infrared, and red-edge bands;
• Group 6: Red-edge and shortwave infrared bands.
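As an illustration, the six groups can be written down as a small lookup table. The band identifiers below (B04 for red, B08 and B8A for near-infrared, B05 to B07 for red-edge, B11 and B12 for shortwave infrared) are an assumption based on the standard Sentinel-2 band naming and not on the study itself, and `select_bands` is a hypothetical helper.

```python
# Assumed role-to-band mapping for Sentinel-2 (not taken from the study).
RED = ["B04"]
NIR = ["B08", "B8A"]
RED_EDGE = ["B05", "B06", "B07"]
SWIR = ["B11", "B12"]
# B10 (cirrus) is left out, assuming it is absent after atmospheric correction.
ALL_BANDS = ["B01", "B02", "B03"] + RED + RED_EDGE + NIR + ["B09"] + SWIR

BAND_GROUPS = {
    1: ALL_BANDS,                # all spectral bands
    2: RED + NIR,                # red and infrared bands
    3: SWIR,                     # all shortwave infrared bands
    4: RED_EDGE,                 # all red-edge bands
    5: RED + NIR + RED_EDGE,     # red, infrared, and red-edge bands
    6: RED_EDGE + SWIR,          # red-edge and shortwave infrared bands
}


def select_bands(features, group):
    """Keep only the per-band features whose band belongs to the given group.

    `features` maps (band, statistic) pairs to values, e.g. ("B05", "mean").
    """
    wanted = set(BAND_GROUPS[group])
    return {key: value for key, value in features.items() if key[0] in wanted}


# Made-up feature values for a single image object.
example = {("B04", "mean"): 0.05, ("B05", "mean"): 0.12, ("B11", "mean"): 0.20}
print(select_bands(example, 4))
```

Keeping the groups in one table makes it straightforward to repeat the same classification experiment once per band combination.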

RapidEye Data
In this study, a RapidEye image was also acquired for Schiermonnikoog Island on 18 July 2015. The pre-processed data were obtained at level 3A, which means radiometric and geometric corrections, as well as geo-referencing, were applied. The image covers 25 km × 25 km with the orthorectified pixel size of 5 m × 5 m. Due to clear weather conditions during the image acquisition, no further atmospheric correction was applied. Both Sentinel-2 and RapidEye images were chosen from a similar time of year to ensure that in both, the vegetation cover is as alike as possible. In this study, the RapidEye image was used for segmentation only, and the features were extracted from Sentinel-2 data.

Reference Data and Sampling
The reference data used in this study included field observations of dominant vegetation species for 30 vegetation plots (30 m × 30 m) collected in July 2015 and a vegetation map belonging to 2010 [32]. This map was obtained from experts' visual interpretation of aerial photographs (1:10,000) combined with extensive field inventory, and it included the same vegetation classes as this study.
To select the training and test samples, stratified random sampling was implemented, where each vegetation class was considered a stratum [41]. The resulting training sample size became 650 for the 10 vegetation classes, and the samples were extracted from the vegetation plots. Table 1 reports the number of samples per stratum. In addition, 434 more samples were identified as an independent test set (2/3 of the number of the training samples). A sample in the context of this study is referred to as an image object (as the result of image segmentation), representing a vegetation patch on the ground.
In this study, three main sources of knowledge were identified to be used as input to build the ES knowledge base. These sources were different and separate from the reference maps used for the sampling process:


• A reference vegetation map of the study area, generated in 2010 [32]. As this map was generated with experts' visual interpretation of aerial photographs and extensive fieldwork, it contained some level of experts' knowledge.
• Ancillary data, including field records of dominant vegetation types for 30 vegetation plots, NDVI data from the Sentinel-2 image, and a digital elevation model (DEM) of the island (produced from laser altimetry by the Dutch ministry of public works, Rijkswaterstaat) to generate height, slope, aspect, etc.
• Published resources about vegetation cover in Schiermonnikoog and its ecology [31][42][43][44].

Methods
The architecture of the proposed classification approach contained three parts: object-based image analysis (OBIA), semi-supervised learning and expert system (SSLES), and classification. Using OBIA, the satellite image was segmented to generate image objects and features. Using SSLES, the number of training samples was increased by labelling a set of most informative unlabelled samples. In the final step, classification was performed on the datasets using a standard supervised classifier.

Object-Based Image Analysis (OBIA)
The mean shift segmentation method was used to generate image objects [45]. This algorithm requires two parameters to be tuned to obtain an optimal segmentation result:
• Spatial radius hs (spatial distance between classes);
• Range radius hr (the spectral difference between classes).
The segmentation was performed using the 5 m × 5 m RapidEye image, due to its finer spatial resolution compared to Sentinel-2. This could result in a higher spatial accuracy of image objects [46]. For a quantitative evaluation of the segmentation results, the method proposed by [47,48] was used, which measures both topological and geometric similarity between the segmented and the reference objects. The method relies on the ratio of the intersected area of the segments and the reference objects, and depending on the size of overlap for each segment and object, over- and under-segmentation indices are calculated.
Using the segmented map and the reference vegetation map, labelled and unlabelled objects were generated. For this, the segmented map was overlaid with the reference samples' layer (obtained from the sampling process that contained training and test samples). Any image object that contained the centroid of a reference sample and had more than 50% overlap with it was treated as a labelled object. The rest were considered unlabelled objects.
Sentinel-2 data were resampled to 5 m resolution to match the spatial resolution of the RapidEye data, so that the features could be extracted per image object using the segmentation results. For mapping vegetation, three categories of features were considered, which have been recognised as important in previous studies [37][49][50][51][52][53], as follows: (a) a set of spectral features consisting of the mean, standard deviation, median, and minimum and maximum values of pixels within an image object, (b) a set of textural features including GLCM (grey-level co-occurrence matrix) and GLDV (grey-level difference vector), and (c) a set of geometrical features representing the area and perimeter of the image objects.
The final results of OBIA are image objects with a corresponding informative set of image features.
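The spectral features in category (a) can be sketched as follows. This is a minimal illustration, assuming the segmentation is available as a raster of object IDs aligned with one resampled band; the function names and the use of the population standard deviation are assumptions, and the textural and geometrical features are not covered.

```python
from statistics import mean, median, pstdev


def spectral_features(pixel_values):
    """Summarise the pixel values of one band inside one image object."""
    return {
        "mean": mean(pixel_values),
        "std": pstdev(pixel_values),  # population standard deviation (assumed)
        "median": median(pixel_values),
        "min": min(pixel_values),
        "max": max(pixel_values),
    }


def object_features(label_map, band):
    """Group pixels by object ID and compute the statistics per object."""
    pixels_by_object = {}
    for label_row, band_row in zip(label_map, band):
        for obj_id, value in zip(label_row, band_row):
            pixels_by_object.setdefault(obj_id, []).append(value)
    return {obj_id: spectral_features(values)
            for obj_id, values in pixels_by_object.items()}


# Toy 2 x 3 raster with two image objects (IDs 1 and 2); values are made up.
label_map = [[1, 1, 2],
             [1, 2, 2]]
band = [[0.10, 0.20, 0.50],
        [0.30, 0.70, 0.60]]
features = object_features(label_map, band)
print(features[1])
```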

Semi-Supervised Learning
The graph-based semi-supervised learning (SSL) method was used to increase the number of training samples before classification. This method aims to construct a graph G = (V, E) connecting similar samples. V consists of N = L + U samples, where L and U are the numbers of labelled and unlabelled samples, respectively. The edges E are typically represented by a symmetric similarity weight matrix $W \in \mathbb{R}^{N \times N}$ [26]. The k-nearest neighbour (KNN) approach was used to construct the weight matrix $W$:
$$ W_{ij} = \begin{cases} \exp\left(-\dfrac{\lVert x_i - x_j \rVert^2}{2\sigma^2}\right), & x_j \in N_K(x_i) \\ 0, & \text{otherwise} \end{cases} $$
where $\sigma$ is the Gaussian kernel bandwidth, $\lVert x_i - x_j \rVert$ is the similarity measure between two samples, and $N_K(x_i)$ is the set of K nearest neighbours of sample $x_i$. In this study, the similarity measure between two samples was based on image features obtained from OBIA.
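A minimal sketch of such a KNN graph with Gaussian weights is given below. The kernel form exp(-d²/(2σ²)) with Euclidean distance d, and the rule of keeping an edge when either sample is among the other's nearest neighbours (to make W symmetric), are assumptions for illustration; `knn_weight_matrix` is a hypothetical name.

```python
import math


def knn_weight_matrix(samples, k, sigma):
    """Build a symmetric KNN similarity matrix with Gaussian edge weights."""
    n = len(samples)

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Indices of the k nearest neighbours of sample i (excluding i itself).
        neighbours = sorted((j for j in range(n) if j != i),
                            key=lambda j: sq_dist(samples[i], samples[j]))[:k]
        for j in neighbours:
            weight = math.exp(-sq_dist(samples[i], samples[j]) / (2.0 * sigma ** 2))
            # Symmetrise: keep the edge if either sample is a neighbour of the other.
            W[i][j] = weight
            W[j][i] = weight
    return W


# Three one-feature samples (made-up values), k = 1, sigma = 0.2.
W = knn_weight_matrix([[0.0], [0.1], [1.0]], k=1, sigma=0.2)
for row in W:
    print(row)
```

In this toy example the middle sample ends up with two edges although k = 1, because the third sample selects it as its nearest neighbour; symmetrising a KNN graph can therefore connect a node to more than K neighbours.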
To construct the graph and assign the class labels to the unlabelled samples, the energy function proposed by [54] was optimised. In this function, $f = (f_L, f_U)^T$ is composed of $f_L$ and $f_U$, which are the predicted class labels of the labelled and unlabelled samples, and $y_i$ is the class label vector of sample $i$. $\Delta$ is the graph Laplacian matrix obtained by $\Delta = D - W$, and $D$ is the diagonal degree matrix given by $D_{ii} = \sum_j W_{ij}$.
The label propagation technique was then employed to propagate the information through the graph to the unlabelled samples [11,55]. For this purpose, the weights of the edges in the graph were computed according to Equation (2), and then the probability matrix was estimated as $P = D^{-1}W$, i.e., $P_{ij} = W_{ij} / \sum_k W_{ik}$.
The edge with the highest probability determines the label of the unlabelled sample.
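The labelling rule can be sketched in a few lines. This is a simplified single-pass illustration, assuming the probability matrix is the row-normalised weight matrix and that an unlabelled sample takes the label of the labelled neighbour reached by its highest-probability edge; the function names and the weights are hypothetical.

```python
def transition_matrix(W):
    """Row-normalise the weight matrix so that each row sums to one."""
    P = []
    for row in W:
        degree = sum(row)
        P.append([w / degree if degree > 0 else 0.0 for w in row])
    return P


def propagate_once(P, labels):
    """Give each unlabelled sample (None) the label of its strongest labelled edge."""
    new_labels = list(labels)
    for i, label in enumerate(labels):
        if label is not None:
            continue
        candidates = [(P[i][j], labels[j]) for j in range(len(labels))
                      if labels[j] is not None and P[i][j] > 0]
        if candidates:
            new_labels[i] = max(candidates, key=lambda c: c[0])[1]
    return new_labels


# Sample 0 is unlabelled and connected to four labelled samples (made-up weights).
W = [[0.00, 0.50, 0.20, 0.15, 0.15],
     [0.50, 0.00, 0.00, 0.00, 0.00],
     [0.20, 0.00, 0.00, 0.00, 0.00],
     [0.15, 0.00, 0.00, 0.00, 0.00],
     [0.15, 0.00, 0.00, 0.00, 0.00]]
labels = [None, "Forest", "High Shrub", "Herbs", "Herbs"]
P = transition_matrix(W)
result = propagate_once(P, labels)
print(result[0])
```

With these weights the unlabelled sample receives the label of its single strongest edge even though two of its neighbours share another class, which is the behaviour the highest-probability rule implies.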

Expert System
The expert system (ES) approach used in this study was described in detail by [56]. The ES, here, was developed to answer the question of "What vegetation type is probable to occur in a given image object?" in reference to the samples that obtained labels with SSL.
Bayes' theorem was used in the ES to compute the probability of the rule that the hypothesis (Ha) occurs in an image object given a piece of evidence (Eb), i.e.,
$$ P(H_a \mid E_b) = \frac{P(E_b \mid H_a)\, P(H_a)}{P(E_b)} $$
where P(Eb|Ha) is the a priori conditional probability that there is a piece of evidence Eb (e.g., a mean slope of less than 0.1°) given a hypothesis Ha (e.g., "high shrub" class) that class Ci occurs in a specific image object. P(Ha) is the probability for the hypothesis (Ha) that class Ci occurs in an object. P(Eb) is the probability that an object has an item of evidence {Eb}. The following steps were taken to compute the probability rules:
• Generate a histogram population of each feature layer in the knowledge base;
• Divide each histogram into 10 quantiles, representing the frequency of the occurrence of each class at each percentile of the feature layer;
• Normalize the frequency values by fitting a normal distribution.
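The first two steps and the Bayes update can be sketched for a single feature layer as follows. This is an illustrative simplification: the evidence is taken to be "the feature value falls in a given decile", probabilities are estimated from raw counts, and the normal-distribution fitting step is omitted. All function names and the toy data are assumptions.

```python
from collections import Counter


def decile_edges(values):
    """Nine inner cut points that split the sorted values into 10 bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(n * q) // 10] for q in range(1, 10)]


def bin_index(value, edges):
    """Index (0 to 9) of the bin that a feature value falls into."""
    return sum(value >= edge for edge in edges)


def class_posteriors(value, samples, edges):
    """Posterior P(H_a | E_b) per class for one feature value.

    `samples` is a list of (feature_value, class_label) pairs.
    """
    n = len(samples)
    class_counts = Counter(label for _, label in samples)
    target_bin = bin_index(value, edges)
    in_bin = Counter(label for v, label in samples
                     if bin_index(v, edges) == target_bin)
    p_evidence = sum(in_bin.values()) / n                    # P(E_b)
    if p_evidence == 0:
        return {}
    posteriors = {}
    for label, count in class_counts.items():
        likelihood = in_bin[label] / count                   # P(E_b | H_a)
        prior = count / n                                    # P(H_a)
        posteriors[label] = likelihood * prior / p_evidence  # Bayes' theorem
    return posteriors


# Made-up feature layer: low values are "herbs", high values are "forest".
samples = [(i, "herbs") for i in range(10)] + [(i, "forest") for i in range(10, 20)]
edges = decile_edges([value for value, _ in samples])
post = class_posteriors(3, samples, edges)
print(post)
```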

SSLES Algorithm
Algorithm 1 demonstrates the inputs, output, and steps of SSLES. In this algorithm, which is fully implemented in MATLAB R2015b, the inputs are image objects with the respective set of features. Steps 1-4 are related to SSL, which generates a graph based on image features and propagates the class labels to the potential unlabelled samples. In step 5, the developed ES performs an independent class label prediction on the semi-labelled samples produced by SSL in step 4. Finally, if both the ES and SSL agree on the same class label, the labelled sample is added to the training set; otherwise, it is returned to the unlabelled set. The flowchart of the algorithm is presented in Figure 2.
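The agreement rule at the end of the algorithm can be illustrated with a short sketch. The original implementation is in MATLAB; the Python function below, its name, and the object IDs are hypothetical and only show the filtering logic.

```python
def ssles_filter(ssl_labels, es_labels):
    """Keep only the semi-labelled samples on which SSL and the ES agree.

    Both arguments map a sample ID to a predicted class label.
    Returns (accepted, rejected): accepted samples join the training set,
    rejected sample IDs go back to the unlabelled set.
    """
    accepted, rejected = {}, []
    for sample_id, ssl_label in ssl_labels.items():
        if es_labels.get(sample_id) == ssl_label:
            accepted[sample_id] = ssl_label
        else:
            rejected.append(sample_id)
    return accepted, rejected


# Made-up predictions for three image objects.
ssl_labels = {"obj1": "Herbs", "obj2": "Forest", "obj3": "Forest"}
es_labels = {"obj1": "Herbs", "obj2": "High shrub", "obj3": "Forest"}
accepted, rejected = ssles_filter(ssl_labels, es_labels)
print(accepted, rejected)
```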

Classification and Evaluation
The final step is classification, where a standard supervised classifier is implemented. This step is performed using the random forest (RF) classifier [57,58]. This classifier has two parameters that need to be set:
• The number of classification trees, i.e., the number of bootstrap iterations (ntree);
• The number of input variables used at each node (mtry).
Several studies have demonstrated that the default value for mtry can provide satisfactory results [59][60][61]. Therefore, this parameter is set to the default value, i.e., the square root of the total number of input features.
In this study, the RF classifier was used in three different scenarios. In the first scenario, the training set generated from SSLES was used to train the classifier; in the second scenario, the training set generated from SSL was used to train the classifier; and in the third scenario, the classifier was trained with the original training set.
For the validation of the classification results, the accuracy assessment based on the error matrix was conducted [62]. The evaluation was performed using the same test samples extracted from the reference data, and the results were evaluated in terms of overall accuracy (OA) and Cohen's kappa coefficient. To assess the statistical significance of the difference between the obtained accuracies, McNemar's test was used with a 95% confidence level and 1 degree of freedom.
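The evaluation measures can be sketched as follows. OA and Cohen's kappa are computed from an error matrix, and McNemar's statistic from the paired correctness of two classifiers on the same test samples. The chi-squared form without continuity correction is an assumption, since the variant used is not stated; 3.841 is the chi-squared critical value for 1 degree of freedom at the 95% confidence level. All input numbers are made up.

```python
def overall_accuracy(cm):
    """Share of correctly classified samples in a square error matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_observed = overall_accuracy(cm)
    p_expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2
    return (p_observed - p_expected) / (1 - p_expected)


def mcnemar_chi2(truth, pred_a, pred_b):
    """McNemar's chi-squared statistic (no continuity correction, assumed)."""
    only_a = sum(a == t and b != t for t, a, b in zip(truth, pred_a, pred_b))
    only_b = sum(a != t and b == t for t, a, b in zip(truth, pred_a, pred_b))
    if only_a + only_b == 0:
        return 0.0
    return (only_a - only_b) ** 2 / (only_a + only_b)


CHI2_CRITICAL_95_DF1 = 3.841

cm = [[40, 10],
      [5, 45]]
oa = overall_accuracy(cm)
kappa = cohen_kappa(cm)

truth = [1] * 10
pred_a = [1] * 10            # classifier A: all ten samples correct
pred_b = [1] * 4 + [0] * 6   # classifier B: six samples wrong
chi2 = mcnemar_chi2(truth, pred_a, pred_b)
print(oa, kappa, chi2, chi2 > CHI2_CRITICAL_95_DF1)
```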

Object-Based Image Analysis
To start the image segmentation process, the parameters were tuned and evaluated. hs and hr were iterated over the range [1:10]. Using sensitivity analysis, hs = 5 and hr = 7 were chosen as the optimum values. As a result of the image segmentation, a total of 5230 image objects were delineated, with a mean size of 30 pixels. Figure 3 illustrates the final segmented objects on the false colour composite Sentinel-2 image of the study area. Next, the image objects were divided into three data sets: training, test (labelled), and unlabelled objects. The numbers of objects in the generated datasets are listed in Table 2.

Semi-Supervised Learning
To depict the idea of label propagation, a graph is shown in Figure 4 representing a set of labelled and unlabelled samples. The labels are propagated from the labelled to the unlabelled samples based on the probabilities of the arrows. In this example, although two "Herbs" samples are connected to the unlabelled sample in the centre, it is labelled as "Forest" since the probability value of the "Forest" sample is higher than "High Shrub" and two "Herbs". The final output of SSL is semi-labelled samples that might have incorrect labels. Following the generation of semi-labelled samples, an ES was used to classify the semi-labelled samples again, in parallel to SSL. Samples that obtained the same class label as SSL were merged with the original training set to generate a new extended training set; otherwise, they were moved back to the unlabelled pool.

Expert System a Priori Probabilities
The reference vegetation map was used to estimate the a priori probabilities for the vegetation classes and the initial conditional probabilities for all the feature layers (Table 3). The ancillary data were used to extract training samples, and then their feature values were statistically analysed, i.e., using the methodology explained in Section 3.3, to define the probability of occurrence of a vegetation type within an image object. The result is six sets of rules for each feature layer that contains the probability values of occurrence of evidence (i.e., a set of features) given a hypothesis (i.e., specific vegetation type). These rules belong to the mean and standard deviation values of the feature layers (Figure 5).

Figure 5. Example of expert rule weights for ancillary data. The Y-axis shows the initial conditional probability, and the X-axis shows the 10 quantiles.
Figure 5 illustrates how rules were derived for the mean values of the ancillary data. These probability rules were applied to each input sample in order to compute the probability values. The black bars indicate the probability of that sample belonging to each class. Based on three facts derived from the published resources, three more rules were generated. The derived facts were (1) the dependency of some classes, e.g., herbs, on water availability (streams), (2) the presence/absence of some classes around the residential areas, e.g., green beach, and (3) a forestry programme adjacent to the village [63]. Using the method described before, the distances of the vegetation classes from the water streams and the residential area were analysed and then divided into three quantiles, and the probability of occurrence of vegetation classes at each distance quantile was computed (Table 4).

Table 4. Expert rules based on the distance of the vegetation classes to streams and residential areas. Values describe the probability of occurrence of each vegetation class at each distance quantile.
Vegetation Classes | Distance to Streams (Quantile 1, Quantile 2, Quantile 3) | Distance to the Residential Area (Quantile 1, Quantile 2, Quantile 3)

According to the extracted probability rules in Table 4, each sample's feature values were examined and, depending on the quantile in which they lay, a probability value was assigned. Eventually, by iterating through the nine rule sets, each sample gained nine sets of probabilities for each class. Then, these probabilities were merged into a final combined probability, and the class with the highest probability value was assigned to that sample.

SSLES Results
After running SSLES, 1513 new samples were labelled. As part of the process, these newly labelled samples were combined with the original training set to generate the new training set. Table 5 summarizes the number of training samples for each class.

Table 5. Number of training samples per class.
Class   Original training set   Newly labelled samples   New training set
HMG     160                     494                      654
LMG     142                     290                      432
Ag      71                      167                      238
Fr      58                      121                      179
GB      58                      106                      164
TG      45                      99                       144
HS      45                      89                       134
Hr      35                      67                       102
LSS     25                      48                       73
LHS     11                      32                       43
Sum     650                     1513                     2163

Unlike most SSL implementations that exploit the labelled samples iteratively until all the samples have a label, in this study, graph-based SSL was followed by a basic supervised classifier. As shown by [64], this implementation has the advantage of reducing the computational complexity of the algorithm and can classify more new samples as well.

Parameter Tuning
To evaluate the performance of SSLES, three further classification scenarios were conducted for comparison: SSL only, ES only, and RF. Before conducting the experiments, three parameters needed to be tuned to obtain the optimum results. For this, the values of the parameters were varied over a candidate range, and the values yielding the highest OA were selected as the final parameter values. The parameters were the number of nearest neighbours (k) and the kernel bandwidth (σ) belonging to SSL and ntree belonging to the RF classifier, which were tuned in the ranges k = {1, 2, …, 20}, σ = {0.1, 0.2, …, 2}, and ntree = {10, 20, …, 800}, respectively, based on previous studies. The values of the parameters for graph construction were set to k = 16 and σ = 0.2, and for the RF classifier, the parameter was set to ntree = 80, which led to the highest OA.
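The tuning procedure amounts to a search over the three parameter ranges. The sketch below assumes an exhaustive joint search, which the text does not specify, and uses a hypothetical `toy_score` function as a stand-in for running SSLES and measuring the OA; its peak is arbitrary and unrelated to the reported optimum.

```python
from itertools import product


def grid_search(evaluate, k_values, sigma_values, ntree_values):
    """Return the parameter triple with the highest score, and that score."""
    best_params, best_score = None, float("-inf")
    for params in product(k_values, sigma_values, ntree_values):
        score = evaluate(*params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Parameter ranges as stated in the text.
K_VALUES = range(1, 21)                                   # 1, 2, ..., 20
SIGMA_VALUES = [round(0.1 * i, 1) for i in range(1, 21)]  # 0.1, 0.2, ..., 2.0
NTREE_VALUES = range(10, 801, 10)                         # 10, 20, ..., 800


def toy_score(k, sigma, ntree):
    """Placeholder for the OA of one run; peaks at an arbitrary point."""
    return -(abs(k - 5) + abs(sigma - 1.0) + abs(ntree - 100) / 10)


best_params, best_score = grid_search(toy_score, K_VALUES, SIGMA_VALUES, NTREE_VALUES)
print(best_params)
```

In practice `evaluate` would train the classifier with the given parameters and return the OA on a held-out set, so each call is far more expensive than in this toy example.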

Classification Evaluation
After tuning the parameters, the experiments were carried out. Table 6 reports the obtained classification scores for the six groups of band combinations using four methods.

Table 6. Classification results for the six groups of band combinations of Sentinel-2 data, in terms of OA and kappa coefficient. Group 1: (All spectral bands), Group 2: (Red and infrared bands), Group 3: (All shortwave infrared bands), Group 4: (All red-edge bands), Group 5: (Red, infrared, and red-edge bands), Group 6: (Red-edge and shortwave infrared bands).

As can be observed in Table 6, SSLES produces higher accuracy for all six datasets than SSL alone, ES alone, and RF. The highest accuracy was achieved using only the red-edge bands. Figure 6 illustrates the final classified map of the study area obtained with SSLES using the Group 4 band combination.
The confusion matrix of SSL is shown in Figure 8 to report the producers' accuracies of the vegetation classes. Results are presented as a 2D plot where colours represent the accuracy of each class. In Figure 8, the classes with the highest accuracy have a light/white colour on the main diagonal and a dark/black colour in the other cells, which means no misclassification.

Discussion
The obtained results in Table 6 show that SSLES can yield higher classification accuracy than SSL. Furthermore, using the red-edge bands provided the highest accuracy, which confirmed the findings of [3] where the importance of the Sentinel-2 red-edge bands for vegetation classification was highlighted. As is shown in Figure 8, a considerable number of "Herbs" are classified as "High matted grass", and the classifier has confused the "Low matted grass" and "High matted grass" with other classes. This can probably be explained by (i) the high diversity of these two classes in the study area and (ii) the low diversity of training samples belonging to these classes. To gain better insight into SSLES's performance and the advantage of using an ES integrated with SSL, the confusion matrix obtained from SSLES was subtracted from the one from SSL, and the results are presented in a new matrix. In the resulting matrix, if the value of a cell increased, it is labelled "positive change"; if the value decreased, it is labelled "negative change"; otherwise, the cell was given a "no change" label (Figure 9).

Figure 9. A detailed comparison of the two classification methods. The matrix is the result of subtracting the confusion matrix of SSLES from that of SSL. The white colour represents an increase in the cell's value, black indicates a decrease, and cells whose value was unchanged are coloured cyan.
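The cell-wise comparison behind this figure can be sketched as follows. The sketch assumes that an "increase" means the SSLES value of a cell is larger than the SSL value, which is one reading of the description; the function name and the two small matrices are made up for illustration.

```python
def change_map(cm_ssl, cm_ssles):
    """Label each confusion-matrix cell by how it changed from SSL to SSLES."""
    result = []
    for row_ssl, row_ssles in zip(cm_ssl, cm_ssles):
        row = []
        for before, after in zip(row_ssl, row_ssles):
            if after > before:
                row.append("positive change")
            elif after < before:
                row.append("negative change")
            else:
                row.append("no change")
        result.append(row)
    return result


# Made-up two-class confusion matrices (rows: reference, columns: predicted).
cm_ssl = [[30, 10],
          [8, 32]]
cm_ssles = [[36, 4],
            [8, 32]]
changes = change_map(cm_ssl, cm_ssles)
for row in changes:
    print(row)
```

In the ideal case, positive changes appear only on the main diagonal and negative changes only off the diagonal, as in the first row of this toy example.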
Comparing the obtained matrices reveals that the classification accuracies of all vegetation classes improved, except for five off-diagonal cells. In an ideal situation, it is expected that all the positive changes happen in the main diagonal elements and not in the off-diagonal elements, which means a decrease in misclassification. This result implies that the contribution of the ES to the labelling process of SSL has the advantages of removing the less reliable semi-labelled samples and increasing the overall quality of training samples obtained by SSL.
The performance of SSLES can be discussed by considering two perspectives. The first lies in constructing the graph in SSL, which is based on image features rather than spatial neighbourhood and spectral similarity. This could help to obtain a better estimation of the underlying class label of the potential unlabelled image objects. The second perspective is related to the role of the ES in increasing the certainty of labelling in SSL. The ES handled this through the use of probability rules that aimed to link environmental parameters and the location of vegetation types, where it is most likely that a vegetation type may occur.
To have a benchmark to evaluate and compare the performance of SSLES, it is assumed that the number of reference training samples is representative and sufficient to train a standard supervised classifier. Since SSLES increased the number of samples in the training set, there could be a risk of overfitting the classifier due to the high number of training samples. Therefore, to test the robustness of SSLES regarding the number of initial training samples, a new test was conducted using the Group 4 dataset. For this, only 50% of the original reference training samples were used for SSLES, and the result was evaluated with the same test set that had been used previously. This resulted in an OA of 80%, and compared to the case of using all the training samples, no statistically significant difference was observed.
The classification accuracies could have been negatively affected by various factors. These include the uncertainties associated with the samples obtained from OBIA. In OBIA, if the segmentation has low quality, it may not be able to separate two different classes properly in the image; hence, the extracted features for an image object will be a mix of the features of two different classes [65,66]. To avoid this problem in the current study, the segmentation results were compared to the reference sample polygons. Nevertheless, if there is uncertainty in the reference vegetation map, some level of uncertainty may also be found in the segmentation. The segmentation was performed using a RapidEye 5 m image while the features were extracted from the Sentinel-2 image. Although there was a one-year gap between the two images, they were acquired at the same time of year. In such a relatively short time period, no changes are expected in the vegetation structure of the study area except for the agriculture class. This was further confirmed by the high-resolution satellite imagery of Google Earth.
The limitations of SSLES can be discussed considering the restrictions of SSL and the ES. In terms of the ES, assigning the quantitative values for the a priori probabilities has some uncertainties since it is based on the reference vegetation map mainly. Using the knowledge and experience of specialised experts such as ecologists could result in a stronger knowledge base, but due to time constraints, it was not possible. However, the assumption of this study was to use any available source of (prior) knowledge and expertise. The sources of knowledge could be either in the form of a human expert or other resources such as scientific research and published works. Regarding SSL limitations, the KNN method was used for graph construction. A recent study by [67] showed that the KNN may result in irregular graph construction where each node is connected to more than K neighbours. In this case, the algorithm may end up assigning the incorrect class label to the connected nodes. This problem might be more pronounced in the present study because of the high similarity between the vegetation classes.

Conclusions
In this study, a new approach for semi-supervised classification of satellite images was proposed and applied to vegetation cover classification. The algorithm constructs a graph based on image features from OBIA. It uses the Euclidean distance to compute the similarity between samples, and the K nearest neighbours are selected for labelling. Using an ES to supervise the labelling process of the graph-based SSL algorithm was the key point of this study.
The capability of OBIA, SSL, and the ES for classification, particularly in the field of remote sensing, has already been investigated in the literature. The novel contribution of this study was the integration of these three into a single algorithm. The results demonstrate the effectiveness of the proposed algorithm for the challenging problem of vegetation cover classification where some vegetation classes show similar characteristics. The capability of Sentinel-2 spectral data in vegetation classification was assessed, and the results indicate that the red-edge band combination yielded the highest overall accuracy for vegetation cover classification.
In a future study, linking the vegetation classification levels to the concept of Anderson levels for land cover mapping could be considered to adjust the vegetation classes where the highest level has the highest accuracy [68]. From an algorithm perspective, the potential applicability of the SSLES method on different land covers and biomes using different remote sensing data could be analysed. Concerning SSL, different strategies need to be investigated for selecting the unlabelled samples in a more informative and reliable way. Using alternative approaches for graph construction instead of KNN, as well as us-