Designing a European-Wide Crop Type Mapping Approach Based on Machine Learning Algorithms Using LUCAS Field Survey and Sentinel-2 Data

Ghassemi, Babak; Dujakovic, Aleksandar; Żółtak, Mateusz; Immitzer, Markus; Atzberger, Clement; Vuolo, Francesco

doi:10.3390/rs14030541

by Babak Ghassemi, Aleksandar Dujakovic, Mateusz Żółtak, Markus Immitzer, Clement Atzberger and Francesco Vuolo *

Institute of Geomatics, University of Natural Resources and Life Sciences, Vienna (BOKU), Peter-Jordan-Straße 82, 1190 Vienna, Austria

* Author to whom correspondence should be addressed.

Remote Sens. 2022, 14(3), 541; https://doi.org/10.3390/rs14030541

Submission received: 16 December 2021 / Revised: 18 January 2022 / Accepted: 19 January 2022 / Published: 23 January 2022

(This article belongs to the Special Issue Remote Sensing Applications in Agricultural Ecosystems)

Abstract:

One of the most challenging aspects of obtaining detailed and accurate land-use and land-cover (LULC) maps is the availability of representative field data for training and validation. In this manuscript, we evaluate the use of the Eurostat Land Use and Coverage Area frame Survey (LUCAS) 2018 data to generate a detailed LULC map with 19 crop type classes and two broad categories for woodland and shrubland, and grassland. The field data were used in combination with Copernicus Sentinel-2 (S2) satellite data covering Europe. First, spatially and temporally consistent S2 image composites of (1) spectral reflectances, (2) a selection of spectral indices, and (3) several bio-geophysical indicators were created for the year 2018. From the large number of features, the most important were selected for classification using two machine-learning algorithms (support vector machine and random forest). Results indicated that the 19 crop type classes and the two broad categories could be classified with an overall accuracy (OA) of 77.6%, using independent data for validation. Our analysis of three methods to select optimum training data showed that the best OA could be achieved by selecting the most spectrally different pixels for training, using only 11% of the total training data. Comparing our results to a similar study using Sentinel-1 (S1) data indicated that S2 can achieve slightly better results, although the spatial coverage was slightly reduced due to gaps in S2 data. Further analysis is ongoing to leverage synergies between optical and microwave data.

Keywords:

crop type classification; random forest; support vector machine; LUCAS 2018

1. Introduction

Information on land use and land cover (LULC) is crucial for spatial modeling and monitoring of the global hydrological and carbon cycle, energy balance, and status of natural resources [1,2,3]. Repeated, transparent and precise LULC monitoring is also essential for addressing rapid changes in land use such as soil sealing and agricultural production [4]. Most needed are LULC maps that capture dynamic land-use changes, such as those in arable land.

Satellite-based remote sensing is the most appropriate tool for LULC mapping [5] because of its global, continuous and regular coverage and its cost-efficiency [6]. The abundance of freely available remote-sensing data offers unprecedented opportunities to produce land cover and land cover change maps over large areas [7].

The Sentinel-2 (S2) satellites are part of Copernicus, the European Union’s Earth Observation (EO) Programme, and represent one of the most promising instruments to monitor dynamic land cover changes in a timely way [8]. The twin satellites carry a multispectral sensor with 13 spectral bands (four bands at 10 m, six at 20 m, and three at 60 m spatial resolution). With the two satellites in operation, the temporal resolution is 5 days.

The use of S2 data for LULC monitoring applications has been widely documented in several papers. Xiong et al. [9] produced a nominal 30 m binary cropland map (cropland and non-cropland) of the entire African continent using S2 and Landsat-8 (L8) data for 2015 and suggested the application of 10 m S2 data to address limitations in the classification of small fragmented fields. Inglada et al. [10] also combined S2 and L8 time series data to produce a country-wide crop map for France, achieving a kappa value of 0.86 considering 17 classes. Malinowski et al. [4] created a LULC map for a large part of the European continent for 2017 using S2 data, with 13 classes and average overall accuracy (OA) of 86.1% with a methodology potentially suitable for a fully automated classification at a relatively high frequency. Dostálová et al. [11] produced a two-class forest-type map (coniferous and broadleaved) for Europe using Sentinel-1 (S1) for 2017 (achieving an OA of 82.7%). The authors found S1-inherent problems related to the side-viewing observation of SAR sensors for complex terrains that negatively impacted the accuracy. Another publicly released 10 m spatial resolution and global LULC map was derived by (deep) neural nets [12] using S2 data with an OA of 86.0%. In the European Space Agency’s (ESA) WorldCover project, S1 and S2 data were combined to provide a global land cover map for 2020 at 10 m resolution with 11 land-cover classes [13]. Their algorithm achieved an OA of 74.4%, in which the accuracy of the shrubs class and the separation between tree cover and shrubs areas could be improved. Defourny et al. [14] classified five main crop groups and non-cropland areas in three countries (Ukraine, Mali, South Africa) and five local sites distributed across the world using S2 and L8 data with an OA value higher than 80% for all sites, except one. Jiang et al. [15] produced a high-resolution map of major crop types for three large regions in China (2–3 crop types per region), achieving an average OA of 94%.

One of the major issues for deriving accurate and detailed LULC maps is the availability of field data for training (and validation) purposes. In this regard, one concerted effort is represented by the European Land Use and Cover Area frame Survey (LUCAS), organized by Eurostat [16]. LUCAS is a regular in situ survey performed every 3 years (since 2006) to collect land cover data in the European Union (EU). The most recent survey (2018) was further improved to respond to the needs of the EO community with the introduction of the “Copernicus module” point data. These are points characterized by homogeneous land cover types, and therefore they are directly comparable to S1 and S2 pixels. In 2020, d’Andrimont et al. [17] further revised the Copernicus module data by constructing polygon geometries that spatially represent the homogeneous regions of approximately 0.5 ha around each point. The polygon geometries aimed at further increasing the availability of field data; however, care should be taken not to sample more than one pixel from the same polygon, as this would bias the results.

Over recent years, the LUCAS data have offered the opportunity to develop and advance LULC mapping capacities for Europe. For example, Close et al. [18] used the LUCAS 2015 survey and S2 data to classify the Wallonia region in Belgium for monitoring land use, land-use change, and forestry (LULUCF). Pflugmacher et al. [19] produced a pan-European land cover (13 classes) map using the LUCAS 2015 survey and Landsat-8 data. Weigand et al. [20] utilized the LUCAS 2015 survey as reference information for high-resolution land cover mapping using S2 data in Germany (seven classes). Venter and Sydenham [21] generated a 10 m resolution land cover map (ELC10) of Europe composed of eight land cover classes with the fusion of S1 and S2 data.

Notably, d’Andrimont et al. [22] derived a 10 m crop type map with S1 data for the 28 member states of the European Union (EU-28). The study presented a relatively precise and detailed land cover map (19 specific crop type classes plus two broad classes) using a random forest (RF) classifier. Data from the LUCAS Copernicus module (using the polygon geometries) were exploited for training, while the validation was based on an independent set of 87,853 high-quality points filtered from LUCAS core points. The overall accuracy was 74.0% for the 21 classes. The accuracy increased to 79.2% when the classes were grouped into eight broader classes. To date, however, the full potential of S2 data to obtain detailed crop type maps (such as the 19 crop types identified in [22]) has not yet been demonstrated.

To evaluate the potential of S2 time series for the production of detailed LULC maps, this manuscript uses the LUCAS 2018 data to derive detailed EU-wide crop type maps at 10 m spatial resolution based on a spatially and temporally harmonized S2 time series. As different classification algorithms often yield different results [23], this research also includes a comparison between support vector machine (SVM) and random forest (RF). Both methods are relatively insensitive to noise and overtraining, and they can deal with imbalanced data [24]. To compare the results of this research to previous findings, the classification scheme used by [22] was also adopted here.

The specific objectives of this investigation are as follows:

To evaluate the potential of using the LUCAS 2018 data in combination with S2 data for generating detailed LULC maps.
To compare the classification performance of S2 and S1, using [22] as the reference study for S1.
To compare the performance of RF and SVM classifiers.
To study the impact of autocorrelation in field data on the classification accuracy (considering data sampled from the LUCAS Copernicus polygon geometries).

2. Materials and Methods

The area of interest is the EU-28 territory covered by the LUCAS 2018 survey data. The main steps shown in Figure 1 included the extraction of S2 spectral and temporal features at the point locations of the LUCAS data, followed by the SVM and RF hyperparameter tuning, training, and assessment of the results. A final LULC map was generated for the EU-28 regions by combining the best model and an additional model (based on yearly indicators only) to fill the remaining spatial gaps (15% of land area that could not be covered with the optimal feature set).
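The combination of the best model with a yearly-indicator model for gap filling can be sketched as follows. This is a minimal illustration under stated assumptions, not the processing chain used in the study: `predict_best` and `predict_yearly` are hypothetical stand-ins for the two trained models, the feature names are invented, and missing feature values are assumed to be encoded as `None`.

```python
from typing import Callable, Dict, List, Optional, Sequence

Features = Dict[str, Optional[float]]


def classify_with_fallback(
    pixel: Features,
    best_features: Sequence[str],
    yearly_features: Sequence[str],
    predict_best: Callable[[List[float]], str],
    predict_yearly: Callable[[List[float]], str],
) -> Optional[str]:
    """Use the best model where all its features exist, else the yearly-only model."""

    def complete(names: Sequence[str]) -> bool:
        return all(pixel.get(name) is not None for name in names)

    if complete(best_features):
        return predict_best([pixel[name] for name in best_features])
    if complete(yearly_features):
        return predict_yearly([pixel[name] for name in yearly_features])
    return None  # neither feature set is complete: the pixel stays unclassified
```

A pixel with a missing monthly composite would fall through to the yearly model, which trades some accuracy for spatial coverage.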

2.1. Sentinel-2: Data Preparation, Spectral and Temporal Features

The processing chain relied on S2A and S2B L2A data, which were processed at the BOKU/EODC data facility and accessed via dedicated APIs [25]. Only S2 image tiles with cloud cover below 50% were included. A fully automated processing chain was created to produce cloud-masked S2 L2A data at 10 m spatial resolution, all projected to the same spatial reference (EPSG:3035) and extent (EU-28). The six S2 bands at 20 m pixel size were resampled to 10 m using the nearest neighbor algorithm.
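For grids that are aligned and differ by an exact factor of two, nearest neighbor resampling from 20 m to 10 m reduces to repeating each coarse pixel in a 2 × 2 block. The sketch below illustrates only this special case; the alignment assumption and the function name are ours, and a production chain would normally rely on a raster library instead.

```python
from typing import List


def resample_nearest_2x(band_20m: List[List[float]]) -> List[List[float]]:
    """Upsample a 20 m grid to 10 m by repeating each pixel in a 2 x 2 block."""
    band_10m = []
    for row in band_20m:
        fine_row = [value for value in row for _ in range(2)]  # duplicate columns
        band_10m.append(fine_row)
        band_10m.append(list(fine_row))  # duplicate the row
    return band_10m
```

No new reflectance values are created, which is why nearest neighbor is often preferred when the original spectral values must be preserved.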

Starting from this dataset, yearly and monthly cloud-free image composites were computed for the year 2018. The features used in this study are summarized in Table 1.

Yearly indicators contain the 5th, 50th, and 98th centiles of spectral indices (BLFEI [26], BSI [27], MNDWI [28], NDBI [29], NDTI [30], and NDVI [31]) calculated across the year, the day of the year with the maximum NDVI value, and two climate variables (average yearly temperature and rainfall) derived from the WorldClim (v2 at 30 arc-seconds) dataset [32].
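The yearly indicators for one pixel and one index can be sketched as below. The text does not state how the centiles were computed, so linear interpolation between order statistics is an assumption here, as are the function names and the `(day_of_year, value)` input layout.

```python
from typing import Dict, List, Tuple


def percentile(values: List[float], q: float) -> float:
    """q-th percentile (0-100) using linear interpolation between order statistics."""
    if not values:
        raise ValueError("no valid observations")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def yearly_indicators(series: List[Tuple[int, float]]) -> Dict[str, float]:
    """series holds (day_of_year, index_value) pairs from cloud-free observations."""
    values = [value for _, value in series]
    doy_of_max, _ = max(series, key=lambda item: item[1])
    return {
        "p05": percentile(values, 5),
        "p50": percentile(values, 50),
        "p98": percentile(values, 98),
        "doy_max": doy_of_max,  # only used for NDVI in the study
    }
```

Using the 5th and 98th centiles rather than the minimum and maximum makes the indicators less sensitive to residual cloud or shadow outliers.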

Monthly indicators encompass monthly composites of B02-B08, B8A, B11, and B12 from S2 spectral bands, bio-geophysical indicators (FAPAR, FCOVER, and LAI) [33], and NDVI. To build the monthly composite, every pixel was taken from the S2 acquisition presenting the maximum NDVI value over the month. In total, 189 features were obtained.
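The maximum-NDVI compositing rule can be sketched for a single pixel as follows. This is an illustration only: the dictionary layout is hypothetical, cloud-masked acquisitions are assumed to be passed as `None`, and NDVI is assumed to be computed from bands B08 (near infrared) and B04 (red), which the text does not state explicitly.

```python
from typing import Dict, List, Optional

Observation = Dict[str, float]  # band name -> reflectance for one acquisition


def ndvi(obs: Observation) -> float:
    """NDVI from B08 (NIR) and B04 (red); the band choice is an assumption."""
    nir, red = obs["B08"], obs["B04"]
    total = nir + red
    return (nir - red) / total if total != 0 else float("-inf")


def monthly_composite(acquisitions: List[Optional[Observation]]) -> Optional[Observation]:
    """Return all bands of the acquisition with the highest NDVI in the month."""
    valid = [obs for obs in acquisitions if obs is not None]
    if not valid:
        return None  # no cloud-free observation: the composite has a gap
    return max(valid, key=ndvi)
```

Because all bands are taken from the same acquisition, the composite stays spectrally consistent per pixel; months without any valid acquisition produce the gaps discussed in Section 2.2.2.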

2.2. Preparation of the Training Data

2.2.1. European Land Use and Cover Area Frame Survey (LUCAS) 2018 Data

The LUCAS survey was conducted in 2006, 2009, 2012, 2015, and 2018. The 2018 survey was composed of 337,854 points (subsampled from a master grid of 1,090,863 points with a regular distribution of 2 km × 2 km) which are shown in Figure 2.

The smallest area of observation is a circle with a 1.5 m radius. About 70% of core points were surveyed on the ground, and the rest were photo-interpreted in the office (Figure 3). The land cover of these points was labeled according to three different classification schemes. The level-1 legend includes eight broad land cover groups, while level-2 and level-3 comprise detailed classes with 26 and 66 categories, respectively [34].

The regular LUCAS survey was expanded by the so-called Copernicus module to address the needs of the EO community [34]. The Copernicus module was originally planned for 90,620 points out of the main 337,854 core points. Due to difficulties in accessing the locations, however, only 63,364 points could be sampled. A summary of the LUCAS survey data is presented in Table 2.

2.2.2. LUCAS 2018 Data Refinement Due to Gaps in Sentinel-2 Data

To delineate the polygon geometries for each of the LUCAS Copernicus points, d’Andrimont et al. [17] expanded the point geometries in the four cardinal directions (up to 51 m distance), using the information on LULC homogeneity reported in the field during the survey. Among 63,364 polygons, 77 polygons were discarded due to geolocation problems. The remaining 63,287 polygons presented areas from 0.005 ha to 0.52 ha (with an average of 0.32 ha). However, only the polygons for which the level-3 legend was available were kept (resulting in a final set of 58,428 LUCAS Copernicus polygons). Among the 58,428 polygons, S2 spectral and temporal information could be extracted from our S2 dataset only for 56,366 polygons (total area of 1,901,627 individual pixels).

Due to remaining gaps in S2 data, some samples did not have spectral information at specific times of the year (mostly winter months). This caused a large number of missing values in the monthly composites. The statistics are summarized in Table 3. Therefore, an additional data refinement was necessary, leading to the exclusion of 84 features. The final dataset (with 105 features) resulted in a total of 1,349,052 samples (taken from 43,013 different polygons).
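The feature counts reported in Sections 2.1 and 2.2.2 can be checked with a few lines of bookkeeping. The lists below follow the feature description in Section 2.1; reading the 84 excluded features as all 14 monthly features of six months is our inference from the numbers (84 = 6 × 14), not something the text states.

```python
MONTHLY_BANDS = ["B02", "B03", "B04", "B05", "B06", "B07", "B08", "B8A", "B11", "B12"]
MONTHLY_EXTRA = ["FAPAR", "FCOVER", "LAI", "NDVI"]
YEARLY_INDICES = ["BLFEI", "BSI", "MNDWI", "NDBI", "NDTI", "NDVI"]
CENTILES = [5, 50, 98]

per_month = len(MONTHLY_BANDS) + len(MONTHLY_EXTRA)  # 14 features per month
monthly_total = per_month * 12                       # 168 monthly features
# 18 centile features + day of maximum NDVI + 2 climate variables = 21
yearly_total = len(YEARLY_INDICES) * len(CENTILES) + 1 + 2
all_features = monthly_total + yearly_total          # 189

dropped_months = 6  # assumption inferred from 84 / 14
remaining_features = all_features - dropped_months * per_month  # 105
```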

2.2.3. LUCAS 2018 Data Refinement Related to the Classification Scheme

The LUCAS data present eight level-1 land-cover categories: A (artificial land), B (cropland), C (woodland), D (shrubland), E (grassland), F (bare land), G (water), and H (wetlands). Since this study focuses on the classification of the main crop types, and to enable comparability with the S1 study described in [22], the class types were redefined. In the chosen classification scheme, only classes and subclasses of B (cropland), C (woodland), D (shrubland), and E (grassland), together with a subclass from F (bare land), were considered to create 19 specific crop type classes, plus two additional broad classes, namely woodland and shrubland, and grassland. More details can be found in Table 4, adapted from [22]. According to the defined scheme, the total number of samples for training the classification models (1,349,052) decreased to 1,344,885 (taken from 42,753 different polygons) by keeping only the points related to the 21 classes.

2.3. Classification Process

2.3.1. Classification Methods

Two different classification methods were applied in this study: (1) random forest (RF) and (2) support vector machine (SVM). The RF classifier, first introduced by Breiman [35], is a robust machine learning algorithm capable of generating high classification accuracy and quantifying the feature importance. RF is an ensemble technique that utilizes bagging (bootstrap + aggregation) and builds a large set of randomized decision trees. In the bootstrap step, each tree is trained on a random subset of samples rather than on all data. The selected subset is called the bag, and the remaining samples are described as out-of-bag (OOB) samples. The outcome of all trained trees is aggregated, which reduces the variance and increases the classification accuracy. The OOB score can be applied to internally evaluate the trained model (with OOB samples not included during the training process), even without using an independent validation dataset [35,36].
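The split into bag and out-of-bag samples for a single tree can be illustrated with a short sketch. It assumes the standard bootstrap of bagging (drawing as many indices as there are samples, with replacement); the function name and seed are arbitrary.

```python
import random
from typing import List, Tuple


def bootstrap_split(n_samples: int, rng: random.Random) -> Tuple[List[int], List[int]]:
    """Draw n_samples indices with replacement; indices never drawn are out-of-bag."""
    bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    in_bag = set(bag)
    oob = [index for index in range(n_samples) if index not in in_bag]
    return bag, oob


rng = random.Random(0)
bag, oob = bootstrap_split(10_000, rng)
# For large n the expected OOB share approaches (1 - 1/n)^n, roughly 0.368.
oob_fraction = len(oob) / 10_000
```

Each tree therefore leaves out roughly a third of the samples, and these are the samples on which that tree contributes to the OOB score. If several training samples come from the same polygon, near-duplicates of an OOB sample are likely to sit in the bag, which is why the OOB score can become optimistic for spatially autocorrelated data.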

The SVM classifier is a popular kernel-based machine learning method presented by Vapnik [37,38]. The SVM algorithm uses the training data to generate an optimal hyperplane that separates the dataset into the defined classes. The closest samples of different classes in feature space, called support vectors, are used to maximize the margin. Using the kernel trick, SVM can transform the training data into a different space to find better discrimination between categories [39].

2.3.2. Training Data Sub-Setting

To study the impact of autocorrelation in the training data (i.e., many points are sampled from the same polygons), three different training subsets were created (see Table 5 for the resulting training samples): (1) proportional (one random sample from each polygon), (2) balanced (same number of samples for each class) and (3) dissimilar (keeping the most dissimilar samples within each polygon). In detail:

Proportional (One random sample from each polygon)

A single random sample per polygon was selected to represent the polygon. The extracted subset contains 42,753 samples (third column in Table 5), equal to the total number of available Copernicus polygons.

Balanced

At least one point per polygon was selected to reach a balanced dataset of 4000 samples per class. In this case, the majority of classes have the same number of samples. However, the number of polygons in the woodland and shrubland class and in the grassland class is greater than the reference number (4000), so some polygons were not considered in the sampling. Moreover, as shown in Table 5, some classes have fewer samples than the reference number. In this case, all available samples in those classes were included in the balanced subset.

Dissimilar

The 10% most dissimilar samples within each polygon were selected using the similarity matrix (based on the Euclidean distance metric calculated on all samples in an individual polygon).
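The proportional and dissimilar subsets can be sketched as follows. The text does not specify how samples were ranked from the distance matrix, so ranking by mean Euclidean distance to the other samples of the same polygon is an assumption, as is keeping at least one sample per polygon; all names are hypothetical.

```python
import math
import random
from typing import Dict, List, Sequence

Sample = Sequence[float]  # feature vector of one pixel


def proportional_subset(polygons: Dict[str, List[Sample]], rng: random.Random) -> Dict[str, Sample]:
    """One random sample per polygon."""
    return {polygon_id: rng.choice(samples) for polygon_id, samples in polygons.items()}


def dissimilar_subset(samples: List[Sample], fraction: float = 0.10) -> List[int]:
    """Indices of the most dissimilar samples within one polygon.

    Assumption: a sample's dissimilarity is its mean Euclidean distance
    to all other samples of the same polygon.
    """
    n = len(samples)
    if n == 0:
        return []
    if n == 1:
        return [0]
    mean_distance = [
        sum(math.dist(samples[i], samples[j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    # round() guards against floating-point noise such as 0.1 * 30 = 3.0000000000000004
    keep = max(1, math.ceil(round(fraction * n, 9)))
    ranked = sorted(range(n), key=lambda i: mean_distance[i], reverse=True)
    return sorted(ranked[:keep])
```

Keeping at least one sample per polygon is one possible reason why the retained share (11% of all samples, as reported in the results) is slightly above 10%, although the text does not confirm this.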

2.3.3. Feature Selection

A recursive feature elimination method was used to select the important features [40]. In a first step, an RF classification was performed on the training dataset using all 105 features, and the OOB score was calculated. Then, the importance of each feature was extracted, and the least important feature was eliminated from the dataset before the next step. The process was repeated until no features remained.

Using this approach, the difference between the OOB score obtained with the 43 most important features and the OOB score achieved with all 105 features was about 0.3%. Therefore, only these 43 features were further considered for the final classification and accuracy assessment process. The features include: B3 (5th, 7th, 8th, 9th months), B5 (5th, 6th, 7th, 8th, 9th, 10th months), B6 (5th, 6th, 7th months), B8 (5th month), B8A (5th month), B11 (5th, 6th, 7th, 8th, 9th, 10th months), B12 (8th, 9th, 10th months), NDVI (5th, 7th, 8th, 10th months and 5th centile), NDTI (5th, 50th centiles), MNDWI (50th centile), NDBI (5th, 98th centiles), BSI (98th centile), BLFEI (50th centile), LAI (5th month), FAPAR (5th, 7th months), FCOVER (5th, 8th months), TEMP, RAIN. Finally, the model obtaining the highest accuracy was used.
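The elimination loop and the choice of a reduced feature set can be sketched generically. In the sketch, `fit_and_score` is a hypothetical callback that trains a classifier on the given features and returns its OOB score together with per-feature importances; it could, for instance, wrap a random forest implementation that exposes both quantities. The tolerance rule mirrors the reasoning above (a loss of about 0.3% was accepted) but is our formalization, not the authors' code.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# fit_and_score(features) -> (OOB score, importance per feature); hypothetical callback
FitFn = Callable[[Sequence[str]], Tuple[float, Dict[str, float]]]
History = List[Tuple[List[str], float]]


def recursive_elimination(features: Sequence[str], fit_and_score: FitFn) -> History:
    """Drop the least important feature one at a time, recording the score of each step."""
    current = list(features)
    history: History = []
    while current:
        score, importance = fit_and_score(current)
        history.append((list(current), score))
        weakest = min(current, key=lambda name: importance[name])
        current.remove(weakest)
    return history


def smallest_within_tolerance(history: History, tolerance: float) -> List[str]:
    """Smallest feature set whose score is within `tolerance` of the full-set score."""
    full_score = history[0][1]
    acceptable = [feats for feats, score in history if full_score - score <= tolerance]
    return min(acceptable, key=len)
```

With scores expressed as fractions, the criterion used in this section would correspond to calling `smallest_within_tolerance(history, 0.003)`.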

2.3.4. Hyperparameter Tuning

Hyperparameter tuning is an essential step in machine learning, which directly impacts the model performance. Grid search and random search are very popular hyperparameter optimization techniques [41,42]. In the first method, the domain of hyperparameters is discretized to a grid. Then, the performance of all possible combinations is assessed using statistical metrics, here cross-validation (CV). The set of hyperparameters that maximizes the average CV score is selected as the optimal one for training the model. The random search evaluates random combinations of hyperparameters instead of all possible sets in the grid. These two methods were used in sequence to tune the hyperparameters of the RF and SVM classifiers, applied to the main training dataset with the 43 most important features.

In the case of RF tuning, a large grid range of values with 1000 possible combinations was defined for the random search (2nd column in Table 6). Then, 100 randomly chosen combinations were assessed with three-fold CV (300 fits in total), and this was repeated multiple times to reduce the grid span. In the second step, all 56 remaining combinations of the reduced range (3rd column in Table 6) were examined using the grid search method. The 4th column of Table 6 presents the tuned values for the hyperparameters of the RF classifier, namely n_estimators, max_features, min_samples_split, and min_samples_leaf. The other parameters of the RF classifier were kept at their default settings.

The same procedure was applied for tuning the main parameters of the SVM. In this case, 1800 and 90 fits with three-fold CV were utilized for the random search and grid search, respectively. Table 7 reports the adjusted values for the four hyperparameters C, kernel, degree, and gamma. For the other parameters, the default settings were used.
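The two-stage tuning (random search on a wide grid, then exhaustive search on a reduced grid) can be sketched as follows. The grid values are invented placeholders chosen only to reproduce the reported sizes (1000 and 56 combinations); the real ranges are given in Table 6. `toy_cv_score` is a hypothetical stand-in for the mean three-fold CV accuracy of a trained classifier.

```python
import itertools
import random
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, int]
Grid = Dict[str, Sequence[int]]


def expand(grid: Grid) -> List[Params]:
    """All combinations of the grid values."""
    names = list(grid)
    return [dict(zip(names, combo)) for combo in itertools.product(*(grid[n] for n in names))]


def search(candidates: List[Params], cv_score: Callable[[Params], float]) -> Tuple[Params, float]:
    """Evaluate every candidate and return the best one with its score."""
    scored = [(cv_score(params), index) for index, params in enumerate(candidates)]
    best_score, best_index = max(scored)
    return candidates[best_index], best_score


def random_search(grid: Grid, cv_score, n_iter: int, rng: random.Random):
    return search(rng.sample(expand(grid), n_iter), cv_score)


def grid_search(grid: Grid, cv_score):
    return search(expand(grid), cv_score)


def n_fits(n_candidates: int, folds: int = 3) -> int:
    """Number of model fits needed to cross-validate all candidates."""
    return n_candidates * folds


# Placeholder grids (not the values of Table 6): 10 * 4 * 5 * 5 = 1000 and 7 * 2 * 2 * 2 = 56.
wide_grid = {
    "n_estimators": list(range(100, 1001, 100)),
    "max_features": [5, 10, 15, 20],
    "min_samples_split": [2, 4, 6, 8, 10],
    "min_samples_leaf": [1, 2, 3, 4, 5],
}
narrow_grid = {
    "n_estimators": [400, 450, 500, 550, 600, 650, 700],
    "max_features": [10, 15],
    "min_samples_split": [2, 4],
    "min_samples_leaf": [1, 2],
}


def toy_cv_score(params: Params) -> float:
    """Hypothetical score surface peaking at n_estimators=500, min_samples_leaf=1."""
    return -abs(params["n_estimators"] - 500) / 1000 - 0.01 * params["min_samples_leaf"]
```

The fit counts in the text follow directly from the number of candidates times the number of folds: 100 candidates give 300 fits for RF, and the 1800 and 90 fits reported for SVM correspond to 600 and 30 candidate combinations.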

2.4. Accuracy Assessment

2.4.1. Validation Data

To evaluate the performance of the classification models, an independent dataset (not part of the main training dataset) was extracted from the 274,490 LUCAS core points (see Table 2). To select the validation dataset, we followed the procedure outlined in [16]. We retained only those points that were directly interpreted in the field and were located within parcels larger than 0.5 ha with homogeneous land cover. Only samples related to the relevant classes were kept (Table 4). By applying these rules, 91,201 samples were selected. Then, the related features for those points were derived from S2 data. Samples with at least one missing feature were removed. Finally, 70,800 points with 43 features remained for the accuracy assessment.
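The filtering rules can be written as a sequence of simple checks. The record layout below (`observed_in_field`, `parcel_area_ha`, `homogeneous`, `label`, `features`) is hypothetical and does not correspond to the actual LUCAS column names; the sketch only mirrors the order of the rules described above.

```python
from typing import Any, Dict, Iterable, List, Sequence, Set

Point = Dict[str, Any]


def select_validation_points(
    points: Iterable[Point],
    relevant_classes: Set[str],
    feature_names: Sequence[str],
    min_parcel_ha: float = 0.5,
) -> List[Point]:
    """Keep field-observed points in large homogeneous parcels with complete features."""
    selected = []
    for point in points:
        if not point["observed_in_field"]:        # photo-interpreted points are dropped
            continue
        if point["parcel_area_ha"] <= min_parcel_ha:
            continue
        if not point["homogeneous"]:
            continue
        if point["label"] not in relevant_classes:
            continue
        if any(point["features"].get(name) is None for name in feature_names):
            continue                              # at least one S2 feature is missing
        selected.append(point)
    return selected
```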

2.4.2. Assessment Metrics

To evaluate the accuracy, the final classification model was run on the validation data, and the confusion matrix was calculated from the results. Five assessment metrics were derived from the confusion matrix: user accuracy (UA), producer accuracy (PA), F1-score, OA, and kappa coefficient (KC). The UA reflects errors of commission (false positives), while the PA reflects errors of omission (false negatives). The F1-score is the harmonic mean of UA and PA and is a useful metric for evaluating classifications with imbalanced classes and for class-specific analysis. The OA is the ratio of correctly predicted samples to the total number of samples and summarizes the general performance of the model on the validation dataset. The kappa coefficient (KC) was included for comparability with previous studies [43]. This measure quantifies the agreement between classified and true values, corrected for chance agreement. Table 8 shows a categorization of the kappa statistics as defined in [44]. The OOB score is additionally used to evaluate the models generated with RF classifiers.
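The five metrics can be computed from a confusion matrix as sketched below, using their standard definitions. The orientation of the matrix (rows as reference classes, columns as predicted classes) is an assumption of this sketch and may differ from the layout of Table A1.

```python
from typing import Dict, List, Union

Metrics = Dict[str, Union[float, List[float]]]


def accuracy_metrics(cm: List[List[int]]) -> Metrics:
    """cm[i][j] = number of samples of reference class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    reference_totals = [sum(cm[i]) for i in range(k)]
    predicted_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    correct = [cm[i][i] for i in range(k)]

    # UA: share of predictions of a class that are right (1 - commission error)
    ua = [correct[i] / predicted_totals[i] if predicted_totals[i] else 0.0 for i in range(k)]
    # PA: share of reference samples of a class that are found (1 - omission error)
    pa = [correct[i] / reference_totals[i] if reference_totals[i] else 0.0 for i in range(k)]
    # F1: harmonic mean of UA and PA
    f1 = [2 * u * p / (u + p) if (u + p) else 0.0 for u, p in zip(ua, pa)]

    oa = sum(correct) / total
    # Agreement expected by chance from the marginal totals
    expected = sum(r * c for r, c in zip(reference_totals, predicted_totals)) / total ** 2
    kappa = (oa - expected) / (1 - expected)
    return {"UA": ua, "PA": pa, "F1": f1, "OA": oa, "kappa": kappa}
```

For example, `accuracy_metrics([[40, 10], [5, 45]])` describes a two-class case in which 85 of 100 samples are correct, so the OA is 0.85 while kappa is 0.70, because half of the agreement would be expected by chance.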

3. Results

3.1. Selecting the Best Classifier and Dataset

The results of RF and SVM classifiers using all training data and the three smaller training subsets are reported in Table 9.

The SVM classifier provided slightly better results in most cases. Among the subsets, the best outcome was obtained with the ‘Dissimilar’ subset, which reached an OA of 77.6%, compared to 77.8% when using all data for training. High accuracy is therefore reachable with a well-chosen subset that contains only 11% of the total data, at a cost of only 0.2 percentage points in OA.

Regarding the OOB score, the correlation among samples increases as the number of samples grows, which is evident in the divergence between the OOB score and the OA, especially when all data are used for training.

3.2. The Accuracy Assessment of the Best Classification Model

The results were interpreted by applying the best model (“Dissimilar” subset for training and SVM algorithm) to the validation data. The detailed confusion matrix for the 21 LULC classes (19 crop types, grassland, and woodland and shrubland) is presented in Table A1. The map in Figure 4 shows the classification result for the validation data.

When grouping the 21 classes into eight main classes (the scheme is described in the first column of Table 4), the overall accuracy increased from 77.6% to 82.5% (Table 10). The cereals, root crops, non-permanent industrial crops, grassland, and woodland and shrubland classes were well discriminated compared with the other categories, with F1-scores of 83.6%, 77.9%, 75.4%, 79.8%, and 90.4%, respectively. In contrast, the dry pulses, fodder crops, and bare arable land classes had lower accuracy due to their confusion with the cereals, grassland, and woodland and shrubland classes.
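The effect of grouping on the OA can be illustrated with a small sketch: confusions between classes of the same group stop counting as errors, so the grouped OA can never be lower than the detailed one. The mapping below is a toy excerpt for illustration only; the actual scheme is the one in Table 4.

```python
from typing import Dict, List

# Toy excerpt of a class-to-group mapping (see Table 4 for the real scheme).
GROUPS: Dict[str, str] = {
    "maize": "cereals",
    "sugar beet": "root crops",
    "sunflowers": "non-permanent industrial crops",
    "rape and turnip rape": "non-permanent industrial crops",
}


def overall_accuracy(reference: List[str], predicted: List[str]) -> float:
    """Share of samples whose predicted label equals the reference label."""
    hits = sum(1 for ref, pred in zip(reference, predicted) if ref == pred)
    return hits / len(reference)


def group_labels(labels: List[str], mapping: Dict[str, str]) -> List[str]:
    """Replace each detailed class label by its group label."""
    return [mapping[label] for label in labels]


reference = ["sunflowers", "rape and turnip rape", "maize", "sugar beet"]
predicted = ["rape and turnip rape", "rape and turnip rape", "maize", "maize"]

oa_detailed = overall_accuracy(reference, predicted)
oa_grouped = overall_accuracy(group_labels(reference, GROUPS), group_labels(predicted, GROUPS))
```

In this toy case the sunflower sample predicted as rape becomes correct after grouping, while the sugar beet sample predicted as maize remains an error because the two classes belong to different groups.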

In addition, maize, sugar beet, sunflowers, and rape and turnip rape were well discriminated from other crops, as reported in Table A1. The proportion of misclassifications was relatively high among the cereals. The KC for the main (21-class) and grouped (8-class) classification schemes was 0.70 and 0.75, respectively. Thus, according to Table 8, there is a substantial agreement between classified and true data in the generated model.

4. Discussion

This work highlights the high potential of multi-temporal S2 data for large-scale land-cover classification focusing on field crops. Based on monthly composites of the summer months and yearly image composites, 19 field crops, one forest class, and grassland were classified with consistently good results. Both overall and class-specific accuracies are comparable to a recently published study using the same training and validation data but S1 data for the modeling. d’Andrimont et al. [22] achieved an overall accuracy of 74.0% in discriminating the 21 vegetation classes and of 79.2% when the classes were grouped into the main crop type groups. In this work, the 21 classes could be separated with an accuracy of 77.6% and the eight classes with 82.5%, outperforming the S1 classification by 3.6 and 3.3 percentage points, respectively.

Both the training and the validation datasets are affected by an unbalanced number of samples, and this is reflected in the large differences in the class-specific accuracies achieved. Classes with low presence in the field are also significantly less represented in the training data, which often leads to lower accuracies. The grouping of classes could alleviate the problem, but even then, the class-specific accuracies range between 30% and 90% (Table 10). Compared to the study of d’Andrimont et al. [22], the results of the present work are not dramatically better but somewhat higher and more balanced across the classes.

Due to cloud cover, optical satellite data such as Landsat and S2 are often not available over large areas and at regular acquisition intervals. This is not a problem for producing current crop classifications for smaller areas [8] or for analyzing temporally more “stable” classes such as non-vegetation and forests, since scenes from several years can be combined in such cases [45]. Another alternative is the aggregation to seasonal or annual composites [21], whereby even data from several years have to be combined [19] and phenological differences are lost. Another possibility is the smoothing and gap-filling of multi-temporal data [46].

The present work aimed to analyze the potential of monthly and yearly S2 composites. However, it turned out that despite the very good temporal repetition rate of the two S2 satellites, the acquisitions, at least for 2018, were insufficient to generate cloud-free and snow-free monthly composites for all of Europe. Naturally, this leads to missing data, especially in the winter months. Moreover, even in the 6 months used for this study, spectral values were not available for all reference data, which also hampers an area-wide application of the model. The best classification results would be achieved with S2 data acquired in the April-September period. Unfortunately, such a feature set provides only 75% spatial coverage. For geographic regions with data acquisition problems (due to extensive cloud coverage), narrower temporal windows must be used. For example, the monthly indicators between May and August allow coverage of 82% of the EU-28, with a drop in the OA of 1 percentage point compared to the best model. The use of the yearly indicators allows coverage of 97% of the area, with a drop in the accuracy of 4 percentage points. For this purpose, several models with different input data would have to be combined. This underlines the complementary potential of weather-independent microwave data such as S1. This could already be shown with less detailed classes by Venter and Sydenham [21], who could increase the overall accuracy by 3 percentage points by combining S1 and S2. Similar results were obtained by Inglada et al. [47]. Their study also found that the use of S2 outperformed that of S1 alone, which is confirmed by comparing the present study with the results of d’Andrimont et al. [22] based on S1 data.

In addition to high-quality remote-sensing data, reference data are crucial for successful classification. In this respect, both quality and quantity play a decisive role, and a good spatial distribution of the data is also important. Here, the European LUCAS data provide a good dataset that has already been used in several recent papers for large-scale applications [18,19,20,22]. As already mentioned, these studies used seasonal composites of optical data and, therefore, the focus was on broader and more general land-cover classes. However, in addition to continental or global land cover classifications, the current distribution of crops is of interest, and the LUCAS 2018 data represent an excellent dataset. One major problem when using these data is the large difference in the number of samples per class. This study, therefore, investigated whether it is advantageous to use only subsamples of the total available LUCAS data. The accuracies achieved with the independent validation data showed only very small differences between the approaches. However, the use of spectrally different pixels from a reference polygon reflects the class distributions best and gave practically the same results as using all pixels while using only 11% of the samples. The use of a single pixel per polygon (i.e., proportional sampling), on the other hand, provided the lowest accuracies. Not surprisingly, this subset gave the most realistic RF OOB score (Table 9), while using all data for training provided a positively biased RF OOB score (compared to the OA obtained with independent data). This confirms the high significance of the RF OOB score as shown in previous literature, provided that the training data are largely independent [48,49,50,51]. Attempting to create a more balanced training dataset, mainly by reducing the classes with larger numbers of training samples to address the issue of unbalanced datasets [45,50,52,53,54], on the other hand, did not lead to any significant improvement. Also, the influence of the classification approach used was minor, with SVM performing only slightly better than RF, confirming the results of other studies [50,55].

5. Conclusions

Earth observation (EO) data, such as the Copernicus Sentinel-1 (S1) and Sentinel-2 (S2) data, are ideal for repetitive, cost-efficient crop type mapping over wide areas. However, extensive field data are necessary for training and testing the classification algorithms and validating the maps.

In this study, we explored the potential of the Land Use and Coverage Area frame Survey (LUCAS) field data to provide crop type maps in combination with S2 observations for the year 2018. Classification results are very encouraging (achieving an overall accuracy of 77.6% with 19 crop types, plus grassland, and woodland and shrubland, across Europe) and show the potential of both LUCAS field data and S2 satellite data. However, there are several open questions, the most relevant being how to obtain and validate crop type maps in years when LUCAS data are not available. The community is looking forward to the LUCAS 2022 survey data to further test the approach across multiple years in combination with S1 and S2 satellite observations. There is also an urgent need to develop and validate transfer learning approaches that will allow reinforced learning based on multiple years of data. As a result, an improved temporal continuity in the mapping capacity and a more cost-effective use of the field data are expected.

Author Contributions

Conceptualization, F.V.; methodology, F.V., B.G. and M.I.; software, B.G. and M.Ż.; validation, B.G. and M.I.; formal analysis, B.G. and M.I.; investigation, F.V., B.G. and A.D.; resources, F.V.; data curation, B.G. and M.Ż.; writing – original draft preparation, all authors; writing – review and editing, all authors; supervision, F.V., M.I. and C.A.; project administration, F.V.; funding acquisition, F.V. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the European Union’s Horizon 2020 Framework Programme for Research and Innovation under grant agreement No. 774234 (Landsupport) and No. 818346 (SIEUSOIL). The APC was funded by Landsupport.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Two datasets were analyzed in this study. The first is d’Andrimont, Raphael (2020): LUCAS 2018 Copernicus, openly available in FigShare at https://doi.org/10.6084/m9.figshare.12382667.v3 (accessed on 13 December 2021). The second is d’Andrimont, Raphael; Yordanov, Momchil; Martinez-Sanchez, Laura; Eiselt, Beatrice; Palmieri, Alessandra; Dominici, Paolo; et al. (2020): Harmonised LUCAS in-situ land cover and use database for field surveys from 2006 to 2018 in the European Union, openly available in FigShare at https://doi.org/10.6084/m9.figshare.9962765.v2 (accessed on 13 December 2021).

Acknowledgments

This research was motivated by the need for detailed crop type maps that support different research activities and applications. For example, in the Horizon 2020 (H2020) Landsupport project (https://www.landsupport.eu, accessed on 13 December 2021), the development was driven by (1) the need for crop type maps for management, modelling and scenario analysis for best practices in agriculture at the regional level, and (2) the need for spatially and thematically consistent European-wide land cover maps for the application of policy-related tools. In the context of the H2020 SIEUSOIL project (https://www.sieusoil.eu, accessed on 13 December 2021), crop types, intensity and rotations are key indicators for designing optimal soil management practices. A more farmer-oriented application is being developed in the Austrian FFG ARmEO project, where maps are needed to run a benchmarking tool that will allow farmers to compare the performance of their crops against the same crop types growing on similar soils in neighboring regions.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

Table A1. The confusion matrix extracted from the SVM classification result on validation data. UA = user’s accuracy, PA = producer’s accuracy, OA = overall accuracy.

Code	211	212	213	214	215	216	217	218	219	221	222	223	230	231	232	233	240	250	290	300	500	Total	UA	F1-Score
211	3855	326	839	309	212	74	1	280	16	18	17	5	47	19	123	1	66	110	513	99	505	7435	51.8%	62.9%
212	51	224	37	1	19	1	0	3	0	0	0	1	1	1	1	0	10	11	18	5	25	409	54.8%	34.4%
213	388	112	1430	81	164	20	0	43	9	6	4	9	14	1	56	0	58	93	166	29	219	2902	49.3%	49.4%
214	41	3	28	135	20	1	0	59	1	0	0	0	1	0	11	0	1	12	13	2	25	353	38.2%	26.5%
215	17	2	36	9	54	0	0	2	0	0	0	0	1	0	1	0	4	12	9	2	25	174	31.0%	12.3%
216	45	4	31	10	11	3058	14	5	30	36	12	8	7	22	8	36	23	34	67	65	183	3709	82.4%	85.4%
217	0	0	0	0	0	0	19	0	0	0	0	0	0	0	0	0	0	0	0	0	0	19	100.0%	54.3%
218	0	0	1	1	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	0	2	0.0%	0.0%
219	0	0	0	0	0	0	0	0	8	0	0	0	0	0	0	0	0	0	0	0	1	9	88.9%	11.8%
221	0	0	4	1	0	9	0	1	0	230	6	0	7	4	0	8	10	5	6	0	6	297	77.4%	68.7%
222	2	0	7	0	0	4	0	0	7	8	507	12	5	7	3	0	14	1	3	2	10	592	85.6%	84.4%
223	0	0	0	0	0	0	2	0	0	0	1	5	1	0	0	0	0	0	0	0	0	9	55.6%	9.8%
230	0	1	0	0	1	2	2	0	0	0	0	0	88	0	2	1	11	1	4	0	2	115	76.5%	44.3%
231	3	2	2	0	3	15	0	0	2	14	2	5	22	431	3	5	38	7	22	4	52	632	68.2%	73.1%
232	12	1	18	3	1	8	0	4	4	3	7	3	10	1	1105	3	20	4	100	8	62	1377	80.2%	78.4%
233	0	0	2	0	0	4	0	0	0	1	0	0	0	1	1	63	1	1	1	0	2	77	81.8%	60.6%
240	10	7	22	3	7	14	4	1	6	25	21	21	21	17	24	1	203	31	56	4	53	551	36.8%	33.8%
250	5	6	8	3	11	7	0	0	2	4	1	2	4	1	1	3	19	405	11	9	71	573	70.7%	34.3%
290	23	21	46	1	14	4	2	1	1	3	3	3	1	1	14	0	19	6	347	14	118	642	54.0%	26.8%
300	72	27	63	11	27	61	2	5	6	2	9	1	12	9	11	2	40	95	179	25,321	2715	28,670	88.3%	90.4%
500	304	158	315	96	162	167	5	35	35	23	20	18	40	33	79	8	112	963	428	1775	17,477	22,253	78.5%	79.8%
Total	4828	894	2889	664	706	3449	51	439	127	373	610	93	282	548	1443	131	649	1791	1943	27,339	21,551	70,800	OA = 77.6%
PA	79.8%	25.1%	49.5%	20.3%	7.6%	88.7%	37.3%	0.0%	6.3%	61.7%	83.1%	5.4%	31.2%	78.6%	76.6%	48.1%	31.3%	22.6%	17.9%	92.6%	81.1%		OA = 77.6%

References

1. Bonan, G.B. Forests and climate change: Forcings, feedbacks, and the climate benefits of forests. Science 2008, 320, 1444–1449.
2. Brovkin, V.; Sitch, S.; von Bloh, W.; Claussen, M.; Bauer, E.; Cramer, W. Role of land cover changes for atmospheric CO₂ increase and climate change during the last 150 years. Glob. Chang. Biol. 2004, 10, 1253–1266.
3. Foley, J.A.; Defries, R.; Asner, G.P.; Barford, C.; Bonan, G.; Carpenter, S.R.; Chapin, F.S.; Coe, M.T.; Daily, G.C.; Gibbs, H.K.; et al. Global consequences of land use. Science 2005, 309, 570–574.
4. Malinowski, R.; Lewiński, S.; Rybicki, M.; Gromny, E.; Jenerowicz, M.; Krupiński, M.; Nowakowski, A.; Wojtkowski, C.; Krupiński, M.; Krätzschmar, E.; et al. Automated Production of a Land Cover/Use Map of Europe Based on Sentinel-2 Imagery. Remote Sens. 2020, 12, 3523.
5. Topaloğlu, R.H.; Sertel, E.; Musaoğlu, N. Assessment of Classification Accuracies of Sentinel-2 and Landsat-8 Data for Land Cover/Use Mapping. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, XLI-B8, 1055–1059.
6. Khatami, R.; Mountrakis, G.; Stehman, S.V. A meta-analysis of remote sensing research on supervised pixel-based land-cover image classification processes: General guidelines for practitioners and future research. Remote Sens. Environ. 2016, 177, 89–100.
7. Hansen, M.C.; Loveland, T.R. A review of large area monitoring of land cover change using Landsat data. Remote Sens. Environ. 2012, 122, 66–74.
8. Vuolo, F.; Neuwirth, M.; Immitzer, M.; Atzberger, C.; Ng, W.-T. How much does multi-temporal Sentinel-2 data improve crop type classification? Int. J. Appl. Earth Obs. Geoinf. 2018, 72, 122–130.
9. Xiong, J.; Thenkabail, P.; Tilton, J.; Gumma, M.; Teluguntla, P.; Oliphant, A.; Congalton, R.; Yadav, K.; Gorelick, N. Nominal 30-m Cropland Extent Map of Continental Africa by Integrating Pixel-Based and Object-Based Algorithms Using Sentinel-2 and Landsat-8 Data on Google Earth Engine. Remote Sens. 2017, 9, 1065.
10. Inglada, J.; Vincent, A.; Arias, M.; Tardy, B.; Morin, D.; Rodes, I. Operational High Resolution Land Cover Map Production at the Country Scale Using Satellite Image Time Series. Remote Sens. 2017, 9, 95.
11. Dostálová, A.; Lang, M.; Ivanovs, J.; Waser, L.T.; Wagner, W. European Wide Forest Classification Based on Sentinel-1 Data. Remote Sens. 2021, 13, 337.
12. Karra, K.; Kontgis, C.; Statman-Weil, Z.; Mazzariello, J.C.; Mathis, M.; Brumby, S.P. Global land use/land cover with Sentinel 2 and deep learning. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2021), Brussels, Belgium, 11–16 July 2021; pp. 4704–4707.
13. Zanaga, D.; van de Kerchove, R.; de Keersmaecker, W.; Souverijns, N.; Brockmann, C.; Quast, R.; Wevers, J.; Grosu, A.; Paccini, A.; Vergnaud, S.; et al. ESA WorldCover 10 m 2020 v100. 2021. Available online: https://doi.org/10.5281/zenodo.5571935 (accessed on 13 December 2021).
14. Defourny, P.; Bontemps, S.; Bellemans, N.; Cara, C.; Dedieu, G.; Guzzonato, E.; Hagolle, O.; Inglada, J.; Nicola, L.; Rabaute, T.; et al. Near real-time agriculture monitoring at national scale at parcel resolution: Performance assessment of the Sen2-Agri automated system in various cropping systems around the world. Remote Sens. Environ. 2019, 221, 551–568.
15. Jiang, Y.; Lu, Z.; Li, S.; Lei, Y.; Chu, Q.; Yin, X.; Chen, F. Large-Scale and High-Resolution Crop Mapping in China Using Sentinel-2 Satellite Imagery. Agriculture 2020, 10, 433.
16. d’Andrimont, R.; Yordanov, M.; Martinez-Sanchez, L.; Eiselt, B.; Palmieri, A.; Dominici, P.; Gallego, J.; Reuter, H.I.; Joebges, C.; Lemoine, G.; et al. Harmonised LUCAS in-situ land cover and use database for field surveys from 2006 to 2018 in the European Union. Sci. Data 2020, 7, 352.
17. d’Andrimont, R.; Verhegghen, A.; Meroni, M.; Lemoine, G.; Strobl, P.; Eiselt, B.; Yordanov, M.; Martinez-Sanchez, L.; van der Velde, M. LUCAS Copernicus 2018: Earth-observation-relevant in situ data on land cover and use throughout the European Union. Earth Syst. Sci. Data 2021, 13, 1119–1133.
18. Close, O.; Benjamin, B.; Petit, S.; Fripiat, X.; Hallot, E. Use of Sentinel-2 and LUCAS Database for the Inventory of Land Use, Land Use Change, and Forestry in Wallonia, Belgium. Land 2018, 7, 154.
19. Pflugmacher, D.; Rabe, A.; Peters, M.; Hostert, P. Mapping pan-European land cover using Landsat spectral-temporal metrics and the European LUCAS survey. Remote Sens. Environ. 2019, 221, 583–595.
20. Weigand, M.; Staab, J.; Wurm, M.; Taubenböck, H. Spatial and semantic effects of LUCAS samples on fully automated land use/land cover classification in high-resolution Sentinel-2 data. Int. J. Appl. Earth Obs. Geoinf. 2020, 88, 102065.
21. Venter, Z.S.; Sydenham, M.A.K. Continental-Scale Land Cover Mapping at 10 m Resolution Over Europe (ELC10). Remote Sens. 2021, 13, 2301.
22. d’Andrimont, R.; Verhegghen, A.; Lemoine, G.; Kempeneers, P.; Meroni, M.; van der Velde, M. From parcel to continental scale – A first European crop type map based on Sentinel-1 and LUCAS Copernicus in-situ observations. Remote Sens. Environ. 2021, 266, 112708.
23. Lu, D.; Weng, Q. A survey of image classification methods and techniques for improving classification performance. Int. J. Remote Sens. 2007, 28, 823–870.
24. Thanh Noi, P.; Kappas, M. Comparison of Random Forest, k-Nearest Neighbor, and Support Vector Machine Classifiers for Land Cover Classification Using Sentinel-2 Imagery. Sensors 2017, 18, 18.
25. Vuolo, F.; Żółtak, M.; Pipitone, C.; Zappa, L.; Wenng, H.; Immitzer, M.; Weiss, M.; Baret, F.; Atzberger, C. Data Service Platform for Sentinel-2 Surface Reflectance and Value-Added Products: System Use and Examples. Remote Sens. 2016, 8, 938.
26. Bouhennache, R.; Bouden, T.; Taleb-Ahmed, A.; Cheddad, A. A new spectral index for the extraction of built-up land features from Landsat 8 satellite imagery. Geocarto Int. 2019, 34, 1531–1551.
27. Rikimaru, A.; Roy, P.S.; Miyatake, S. Tropical forest cover density mapping. Trop. Ecol. 2002, 43, 39–47.
28. Xu, H. Modification of normalised difference water index (NDWI) to enhance open water features in remotely sensed imagery. Int. J. Remote Sens. 2006, 27, 3025–3033.
29. Zha, Y.; Gao, J.; Ni, S. Use of normalized difference built-up index in automatically mapping urban areas from TM imagery. Int. J. Remote Sens. 2003, 24, 583–594.
30. van Deventer, A.P.; Ward, A.D.; Gowda, P.H.; Lyon, J.G. Using thematic mapper data to identify contrasting soil plains and tillage practices. Photogramm. Eng. Remote Sens. 1997, 63, 87–93.
31. Kriegler, F.J.; Malila, W.A.; Nalepka, R.F.; Richardson, W. Preprocessing transformations and their effects on multispectral recognition. Remote Sens. Environ. 1969, VI, 97–132.
32. Fick, S.E.; Hijmans, R.J. WorldClim 2: New 1-km spatial resolution climate surfaces for global land areas. Int. J. Clim. 2017, 37, 4302–4315.
33. Weiss, M.; Baret, F. S2ToolBox Level 2 products: LAI, FAPAR, FCOVER, Version 1.1. In ESA Contract nr 4000110612/14/I-BG (p. 52); INRA: Avignon, France, 2016.
34. d’Andrimont, R.; Verhegghen, A.; Meroni, M.; Lemoine, G.; Strobl, P.; Eiselt, B.; Yordanov, M.; Martinez-Sanchez, L.; van der Velde, M. LUCAS Copernicus 2018: Earth Observation relevant insitu data on land cover throughout the European Union. Earth Syst. Sci. Data Discuss. 2020, 2020, 1–19.
35. Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
36. Immitzer, M.; Atzberger, C.; Koukal, T. Tree Species Classification with Random Forest Using Very High Spatial Resolution 8-Band WorldView-2 Satellite Data. Remote Sens. 2012, 4, 2661–2693.
37. Cortes, C.; Vapnik, V. Support-vector networks. Mach. Learn. 1995, 20, 273–297.
38. Vapnik, V. The Nature of Statistical Learning Theory; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1999; ISBN 9780387987804.
39. Sheykhmousa, M.; Mahdianpari, M.; Ghanbari, H.; Mohammadimanesh, F.; Ghamisi, P.; Homayouni, S. Support Vector Machine vs. Random Forest for Remote Sensing Image Classification: A Meta-Analysis and Systematic Review. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 6308–6325.
40. Schultz, B.; Immitzer, M.; Formaggio, A.; Sanches, I.; Luiz, A.; Atzberger, C. Self-Guided Segmentation and Classification of Multi-Temporal Landsat 8 Images for Crop Type Mapping in Southeastern Brazil. Remote Sens. 2015, 7, 14482–14508.
41. Elgeldawi, E.; Sayed, A.; Galal, A.R.; Zaki, A.M. Hyperparameter Tuning for Machine Learning Algorithms Used for Arabic Sentiment Analysis. Informatics 2021, 8, 79.
42. Bergstra, J.; Bengio, Y. Random Search for Hyper-Parameter Optimization. J. Mach. Learn. Res. 2012, 13, 281–305.
43. Cohen, J. A Coefficient of Agreement for Nominal Scales. Educ. Psychol. Meas. 1960, 20, 37–46.
44. Landis, J.R.; Koch, G.G. A One-Way Components of Variance Model for Categorical Data. Biometrics 1977, 33, 671.
45. Immitzer, M.; Neuwirth, M.; Böck, S.; Brenner, H.; Vuolo, F.; Atzberger, C. Optimal Input Features for Tree Species Classification in Central Europe Based on Multi-Temporal Sentinel-2 Data. Remote Sens. 2019, 11, 2599.
46. Vuolo, F.; Ng, W.-T.; Atzberger, C. Smoothing and gap-filling of high resolution multispectral time series: Example of Landsat data. Int. J. Appl. Earth Obs. Geoinf. 2017, 57, 202–213.
47. Inglada, J.; Vincent, A.; Arias, M.; Marais-Sicre, C. Improved Early Crop Type Identification By Joint Use of High Temporal Resolution SAR And Optical Image Time Series. Remote Sens. 2016, 8, 362.
48. Belgiu, M.; Guţ, L.; Strobl, J. Quantitative evaluation of variations in rule-based classifications of land cover in urban neighbourhoods using WorldView-2 imagery. ISPRS J. Photogramm. Remote Sens. 2014, 87, 205–215.
49. Millard, K.; Richardson, M. On the Importance of Training Data Sample Selection in Random Forest Image Classification: A Case Study in Peatland Ecosystem Mapping. Remote Sens. 2015, 7, 8489–8515.
50. Ørka, H.O.; Dalponte, M.; Gobakken, T.; Næsset, E.; Ene, L.T. Characterizing forest species composition using multiple remote sensing data sources and inventory approaches. Scand. J. For. Res. 2013, 28, 677–688.
51. Toscani, P.; Immitzer, M.; Atzberger, C. Wavelet-based texture measures for object-based classification of aerial images. PFG Photogramm. Fernerkund. Geoinf. 2013, 2013, 105–121.
52. Colditz, R. An Evaluation of Different Training Sample Allocation Schemes for Discrete and Continuous Land Cover Classification Using Decision Tree-Based Algorithms. Remote Sens. 2015, 7, 9655–9681.
53. Mellor, A.; Boukir, S.; Haywood, A.; Jones, S. Exploring issues of training data imbalance and mislabelling on random forest performance for large area land cover classification using the ensemble margin. ISPRS J. Photogramm. Remote Sens. 2015, 105, 155–168.
54. Immitzer, M.; Böck, S.; Einzmann, K.; Vuolo, F.; Pinnel, N.; Wallner, A.; Atzberger, C. Fractional cover mapping of spruce and pine at 1 ha resolution combining very high and medium spatial resolution satellite imagery. Remote Sens. Environ. 2018, 204, 690–703.
55. Ghosh, A.; Joshi, P.K. A comparison of selected classification algorithms for mapping bamboo patches in lower Gangetic plains using very high resolution WorldView 2 imagery. Int. J. Appl. Earth Obs. Geoinf. 2014, 26, 298–311.

Figure 1. The general steps for the classification approach.

Figure 2. Distribution of all European Land Use and Cover Area frame Survey (LUCAS) 2018 polygons over the European Union (EU-28) countries.

Figure 3. LUCAS 2018: the schema for data collection.

Figure 4. Final land-use land-cover (LULC) map generated over the EU-28 regions (coverage 97% of the land area). The figure combines a map obtained with the best model (43 features, OA = 77.6%) and an additional model based on yearly indicators only (21 features, OA = 72.5%) used to fill remaining gaps (15% of the land area).

Table 1. The spectral bands and indices, bio-geophysical indicators, climatic variables, and phenological indicators used for classification.

Feature Name	Description		Counts
Bio-geophysical Indicators	LAI, FAPAR, FCOVER		12 monthly composites
Spectral Bands	B2: Blue	B7: Red Edge 3
	B3: Green	B8: NIR
	B4: Red	B8A: NIR narrow
	B5: Red Edge 1	B11: SWIR 1
	B6: Red Edge 2	B12: SWIR 2
Climatic variables	RAIN: Mean rainfall value during the year 2018		1 yearly value
Climatic variables	TEMP: Mean temperature value during the year 2018
Phenological variables	Day of maximum NDVI
Spectral Indices	BLFEI: $((B3 + B4 + B12)/3 - B11) / ((B3 + B4 + B12)/3 + B11)$		12 monthly composites and 3 yearly centiles
	BSI: $((B11 + B4) - (B8 + B2)) / ((B11 + B4) + (B8 + B2))$
	MNDWI: $(B3 - B11) / (B3 + B11)$
	NDBI: $(B11 - B8) / (B11 + B8)$
	NDTI: $(B11 - B12) / (B11 + B12)$
	NDVI: $(B8 - B4) / (B8 + B4)$

Abbreviations: LAI = leaf area index, FAPAR = fraction of absorbed photosynthetically active radiation, FCOVER = fraction of green vegetation cover, NIR = near infrared, SWIR = shortwave infrared, NDVI = normalized difference vegetation index, NDTI = normalized difference tillage index, MNDWI = modified normalized difference water index, NDBI = normalized difference built-up index, BSI = bare soil index, BLFEI = built-up land features extraction index.
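The index formulas in Table 1 are all normalized differences, so they can be sketched with one helper. The following is a minimal illustration in plain Python, not the processing chain used in the study; the function names and the reflectance values of the example pixel are assumptions made for demonstration only.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or None when the denominator is zero."""
    denominator = a + b
    if denominator == 0:
        return None
    return (a - b) / denominator


def spectral_indices(bands):
    """Compute the six Table 1 indices from a dict of S2 band reflectances."""
    b2, b3, b4 = bands["B2"], bands["B3"], bands["B4"]
    b8, b11, b12 = bands["B8"], bands["B11"], bands["B12"]
    # BLFEI compares the mean of green, red and SWIR 2 against SWIR 1
    green_red_swir2_mean = (b3 + b4 + b12) / 3
    return {
        "BLFEI": normalized_difference(green_red_swir2_mean, b11),
        "BSI": normalized_difference(b11 + b4, b8 + b2),
        "MNDWI": normalized_difference(b3, b11),
        "NDBI": normalized_difference(b11, b8),
        "NDTI": normalized_difference(b11, b12),
        "NDVI": normalized_difference(b8, b4),
    }


# Hypothetical reflectances for a vegetated pixel (illustrative values only)
pixel = {"B2": 0.03, "B3": 0.06, "B4": 0.04, "B8": 0.40, "B11": 0.20, "B12": 0.10}
indices = spectral_indices(pixel)
```

For non-negative reflectances every index falls in the range from -1 to 1, which makes the indices directly comparable across monthly composites.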

Table 2. Distribution of LUCAS 2018 survey data [34].

	LUCAS 2018 Points	LUCAS Copernicus Points Training Data	LUCAS Core Points Validation Data
In-situ	238,014	63,364	174,650
Office photo-interpreted	99,803	-	99,803
Others	37	-	37
Total	337,854	63,364	274,490
		LUCAS Copernicus polygons
Remaining after exclusion of points due to geolocation problems in the construction of polygons		63,287
Remaining after exclusion of points due to missing level-3 information		58,428
Remaining after exclusion of points due to missing S2 data		56,366 (1,901,627 pixels)
Remaining after exclusion of points due to missing S2 features in winter months		43,013 (1,349,052 pixels)
Remaining after exclusion of points not covered by one of the 21 LC classes assessed in this study		42,753 (1,344,885 pixels)

Table 3. Number and percentage of pixels with missing values for each monthly compositing period. These numbers refer to the 56,366 Copernicus polygons (1,901,627 pixels) reported in Table 2.

Month (2018)	Dec.	Jan.	Feb.	Mar.	Nov.	Apr.	Sep.	Aug.	May.	Jun.	Jul.	Oct.
Missing values	1,191,638	1,156,586	994,954	883,140	748,819	167,270	99,414	86,544	86,029	83,494	79,334	73,261
% of missing values	63%	61%	52%	46%	39%	9%	5%	5%	5%	4%	4%	4%

Table 4. The classification scheme used to produce the EU map with 19 crop types plus two broad categories with Woodland & Shrubland and Grassland classes. The “Main Class Name”, with the respective class codes (“Code”), was used in this study.

Grouped Class Name	Code	Main Class Name	Class Descriptors in LUCAS Level-3 Land Cover
Cereals	211	Common wheat	B11-Common wheat
	212	Durum wheat	B12-Durum wheat
	213	Barley	B13-Barley
	214	Rye	B14-Rye
	215	Oats	B15-Oats
	216	Maize	B16-Maize
	217	Rice	B17-Rice
	218	Triticale	B18-Triticale
	219	Other cereals	B19-Other cereals
Root crops	221	Potatoes	B21-Potatoes
	222	Sugar beet	B22-Sugar beet
	223	Other root crops	B23-Other root crops
Non-permanent industrial crops	230	Other non-permanent industrial crops	B34-Cotton
			B35-Other fibre and oleaginous crops
			B36-Tobacco
			B37-Other non-permanent industrial crops
	231	Sunflower	B31-Sunflower
	232	Rape and turnip rape	B32-Rape and turnip rape
	233	Soya	B33-Soya
Dry pulses, vegetables, and flowers	240	Dry pulses, vegetables, and flowers	B41-Dry pulses
			B42-Tomatoes
			B43-Other fresh vegetables
			B44-Floriculture and ornamental plants
			B45-Strawberries
Fodder crops	250	Fodder crops	B51-Clovers
			B52-Lucerne
			B53-Other leguminous and mixtures for fodder
			B54-Mixed cereals for fodder
Bare arable land	290	Bare arable land	F40-Other bare soil (only with U111/112/113 Land use)
Woodland and shrubland	300	Woodland and shrubland	B71-Apple fruit
			B72-Pear fruit
			B73-Cherry fruit
			B74-Nuts trees
			B75-Other fruit trees and berries
			B76-Oranges
			B77-Other citrus fruit
			B81-Olive groves
			B82-Vineyards
			B83-Nurseries
			B84-Permanent industrial crops
			C10-Broadleaved woodland
			C21-Spruce dominated coniferous woodland
			C22-Pine dominated coniferous woodland
			C23-Other coniferous woodland
			C31-Spruce dominated mixed woodland
			C32-Pine dominated mixed woodland
			C33-Other mixed woodland
			D10-Shrubland with sparse tree cover
			D20-Shrubland without tree cover
Grassland	500	Grassland	B55-Temporary grasslands
			E10-Grassland with sparse tree/shrub cover
			E20-Grassland without tree/shrub cover
			E30-Spontaneously vegetated surfaces

Table 5. Available training samples for each of the three different subsets.

Class Name	All Data	Proportional	Balanced	Dissimilar
Woodland and shrubland	562,564	16,708	4000	62,812
Grassland	338,977	12,300	4000	39,149
Common wheat	132,878	3749	4000	14,757
Maize	63,788	1893	4000	7102
Barley	54,881	1830	4000	6250
Fodder crops	28,882	997	4000	3316
Rape and turnip rape	30,388	887	4000	3393
Bare arable land	19,678	777	4000	2336
Sunflower	17,001	520	4000	1913
Dry pulses, vegetables, and flowers	14,241	487	4000	1619
Rye	16,934	483	4000	1885
Oats	13,778	474	4000	1579
Durum wheat	11,236	381	4000	1284
Sugar beet	10,752	327	4000	1209
Triticale	9573	284	4000	1070
Potatoes	6827	227	4000	777
Other non-permanent industrial crops	5363	178	4000	612
Soya	3234	116	3234	371
Other cereals	2187	71	2187	247
Other root crops	1647	61	1647	189
Rice	76	3	76	9
Sum	1,344,885	42,753	75,144	151,879
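The "Balanced" column of Table 5 is consistent with capping each class at 4000 samples while keeping smaller classes whole. The sketch below illustrates such a cap in plain Python. Drawing the retained samples at random, the function name `balanced_subset`, and the fixed seed are assumptions for illustration; this section does not state how the retained pixels were chosen.

```python
import random
from collections import Counter, defaultdict


def balanced_subset(samples, cap=4000, seed=0):
    """Keep at most `cap` randomly chosen (features, label) samples per class."""
    by_class = defaultdict(list)
    for features, label in samples:
        by_class[label].append((features, label))
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    subset = []
    for items in by_class.values():
        if len(items) > cap:
            items = rng.sample(items, cap)  # down-sample large classes only
        subset.extend(items)
    return subset


# Class sizes from the "All Data" column of Table 5 (three classes only);
# the empty tuple stands in for a real feature vector
all_data_counts = {"Common wheat": 132878, "Soya": 3234, "Rice": 76}
samples = [((), label) for label, n in all_data_counts.items() for _ in range(n)]
balanced_counts = Counter(label for _, label in balanced_subset(samples))
```

With these inputs, common wheat is reduced to 4000 samples while soya and rice keep all of their samples, matching the corresponding entries in the "Balanced" column.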

Table 6. Hyperparameters tuning process for the random forest (RF) classifier.

Parameters	Value Range in Random Search	Value Range in Grid Search	Tuned Values
Number of possible permutations	1000	56
Number of assessed permutations	100	56
n_estimators	[200:200:2000]	[600:100:1900]	1100
max_features	[‘sqrt’, ‘log2’]	[‘sqrt’]	[‘sqrt’]
min_samples_split	[1:1:10]	[2:1:5]	3
min_samples_leaf	[1:1:5]	1	1

Table 7. Hyperparameters tuning process for support vector machine (SVM) classifier. Auto = 1/Number of Features.

Parameters	Value Range in Random Search	Value Range in Grid Search	Tuned Values
Number of possible permutations	1800	30
Number of assessed permutations	180	30
C	[1:1:100]	[2:1:6]	3
kernel	[‘poly’, ‘rbf’, ‘sigmoid’]	[‘rbf’]	[‘rbf’]
degree	[2:1:4]	[2:1:4]	2
gamma	[‘scale’, ‘auto’]	[‘scale’, ‘auto’]	[‘auto’]
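The permutation counts in Tables 6 and 7 follow directly from the sizes of the listed value ranges. The sketch below expands the `[start:step:stop]` notation and enumerates the combinations in plain Python. No machine-learning library is called, and the helper names, the random seed, and the way the 100 candidates are drawn are assumptions for illustration rather than the tuning code used in the study.

```python
import random
from itertools import product


def inclusive_range(start, step, stop):
    """Expand the [start:step:stop] notation of Tables 6 and 7 (stop included)."""
    return list(range(start, stop + 1, step))


def all_combinations(space):
    """Enumerate every parameter combination in a search space."""
    names = list(space)
    return [dict(zip(names, values)) for values in product(*(space[n] for n in names))]


rf_random_space = {
    "n_estimators": inclusive_range(200, 200, 2000),
    "max_features": ["sqrt", "log2"],
    "min_samples_split": inclusive_range(1, 1, 10),
    "min_samples_leaf": inclusive_range(1, 1, 5),
}
rf_grid_space = {
    "n_estimators": inclusive_range(600, 100, 1900),
    "max_features": ["sqrt"],
    "min_samples_split": inclusive_range(2, 1, 5),
    "min_samples_leaf": [1],
}
svm_random_space = {
    "C": inclusive_range(1, 1, 100),
    "kernel": ["poly", "rbf", "sigmoid"],
    "degree": inclusive_range(2, 1, 4),
    "gamma": ["scale", "auto"],
}
svm_grid_space = {
    "C": inclusive_range(2, 1, 6),
    "kernel": ["rbf"],
    "degree": inclusive_range(2, 1, 4),
    "gamma": ["scale", "auto"],
}

rf_candidates = all_combinations(rf_random_space)
# Random search assesses only a subset: 100 of the RF combinations (Table 6)
rf_assessed = random.Random(0).sample(rf_candidates, 100)
```

The two-stage design is visible in the numbers: the random search samples about 10% of a wide space, and the grid search then exhaustively covers a much narrower space around the promising region.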

Table 8. Rating criteria of kappa coefficient (KC).

KC	<0.00	0.00–0.20	0.21–0.40	0.41–0.60	0.61–0.80	0.81–1.00
Strength of agreement	Poor	Slight	Fair	Moderate	Substantial	Almost perfect
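As an illustration of how the rating in Table 8 can be applied, the sketch below computes Cohen's kappa from a square confusion matrix and maps it to the corresponding agreement label. The function names and the 2 × 2 toy matrix are hypothetical. Values that fall between the printed class limits (for example 0.205) are assigned to the higher class here, which is an assumption.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square confusion matrix given as a list of rows."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Chance agreement expected from the marginal totals
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total**2
    return (observed - expected) / (1 - expected)


def agreement_strength(kappa):
    """Map a kappa coefficient to the rating classes of Table 8."""
    if kappa < 0:
        return "Poor"
    limits = [(0.20, "Slight"), (0.40, "Fair"), (0.60, "Moderate"), (0.80, "Substantial")]
    for upper, label in limits:
        if kappa <= upper:
            return label
    return "Almost perfect"


# Toy 2 x 2 confusion matrix (hypothetical counts, not taken from the study)
toy_matrix = [[40, 10], [5, 45]]
kappa = cohen_kappa(toy_matrix)
rating = agreement_strength(kappa)
```

For the toy matrix, the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is 0.70, which Table 8 rates as "Substantial".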

Table 9. The results of implementing two classifiers on different subsets of training data. OA = overall accuracy evaluated on the validation data; OOB = out of bag score (available for RF only).

Sampling Method	Number of Samples	OOB for RF	OA for RF	OA for SVM
Proportional	42,753	77.5%	75.7%	76.3%
Balanced	75,144	91.4%	72.6%	71.2%
Dissimilar	151,879	86.8%	76.2%	77.6%
All data	1,344,885	98.2%	76.8%	77.8%
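Two quantities help in reading Table 9: the share of the full training set used by each subset, and the gap between the RF OOB score and the RF OA on independent validation data. The short calculation below derives both from the table values. The dictionary layout and variable names are illustrative assumptions.

```python
# Values copied from Table 9: samples, RF OOB (%), RF OA (%), SVM OA (%)
results = {
    "Proportional": {"samples": 42753, "oob": 77.5, "oa_rf": 75.7, "oa_svm": 76.3},
    "Balanced": {"samples": 75144, "oob": 91.4, "oa_rf": 72.6, "oa_svm": 71.2},
    "Dissimilar": {"samples": 151879, "oob": 86.8, "oa_rf": 76.2, "oa_svm": 77.6},
    "All data": {"samples": 1344885, "oob": 98.2, "oa_rf": 76.8, "oa_svm": 77.8},
}

total_samples = results["All data"]["samples"]
summary = {
    name: {
        # Percentage of the full training set used by this subset
        "share_of_all": round(100 * r["samples"] / total_samples, 1),
        # Optimism of the OOB score relative to independent validation
        "oob_minus_oa": round(r["oob"] - r["oa_rf"], 1),
    }
    for name, r in results.items()
}
least_biased = min(summary, key=lambda name: summary[name]["oob_minus_oa"])
```

The dissimilar subset uses about 11.3% of all pixels, and the proportional subset shows the smallest OOB gap (1.8 percentage points, against 21.4 for all data), in line with the discussion of OOB bias.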

Table 10. Confusion matrix for grouped classes by applying the trained model (with dissimilar dataset and SVM classifier) on the validation dataset. UA = user’s accuracy, PA = producer’s accuracy, OA = overall accuracy.

Comprehensive Class	Code	210	220	230	240	250	290	300	500	Total	UA	F1-score
Cereals	210	12,140	116	351	162	272	786	202	983	15,012	80.9%	83.6%
Root crops	220	37	769	35	24	6	9	2	16	898	85.6%	77.9%
Non permanent industrial crops	230	90	35	1736	70	13	127	12	118	2201	78.9%	75.4%
Dry pulses, vegetables and flowers	240	74	67	63	203	31	56	4	53	551	36.8%	33.8%
Fodder crops	250	42	7	9	19	405	11	9	71	573	70.7%	34.3%
Bare arable land	290	113	9	16	19	6	347	14	118	642	54.0%	26.8%
Woodland & shrubland	300	274	12	34	40	95	179	25,321	2715	28,670	88.3%	90.4%
Grassland	500	1277	61	160	112	963	428	1775	17,477	22,253	78.5%	79.8%
Total		14,047	1076	2404	649	1791	1943	27,339	21,551	70,800	OA = 82.5%
PA		86.4%	71.5%	72.2%	31.3%	22.6%	17.9%	92.6%	81.1%		OA = 82.5%
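The accuracy figures in Table 10 can be recomputed from the matrix counts. The sketch below assumes that rows hold the classified (map) classes and columns the reference classes, which is consistent with UA being reported per row and PA per column. The variable and function names are illustrative.

```python
codes = ["210", "220", "230", "240", "250", "290", "300", "500"]
# Counts copied from Table 10: one row per classified class, one column per reference class
matrix = [
    [12140, 116, 351, 162, 272, 786, 202, 983],
    [37, 769, 35, 24, 6, 9, 2, 16],
    [90, 35, 1736, 70, 13, 127, 12, 118],
    [74, 67, 63, 203, 31, 56, 4, 53],
    [42, 7, 9, 19, 405, 11, 9, 71],
    [113, 9, 16, 19, 6, 347, 14, 118],
    [274, 12, 34, 40, 95, 179, 25321, 2715],
    [1277, 61, 160, 112, 963, 428, 1775, 17477],
]


def accuracy_metrics(matrix, codes):
    """Return overall accuracy and per-class UA, PA and F1 as percentages."""
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    total = sum(row_totals)
    correct = [matrix[i][i] for i in range(len(matrix))]
    per_class = {}
    for i, code in enumerate(codes):
        ua = correct[i] / row_totals[i]  # user's accuracy: correct / classified
        pa = correct[i] / col_totals[i]  # producer's accuracy: correct / reference
        f1 = 2 * ua * pa / (ua + pa)     # harmonic mean of UA and PA
        per_class[code] = {k: round(100 * v, 1) for k, v in (("UA", ua), ("PA", pa), ("F1", f1))}
    overall = round(100 * sum(correct) / total, 1)
    return overall, per_class


overall_accuracy, per_class = accuracy_metrics(matrix, codes)
```

Running this reproduces the OA of 82.5% and, for cereals (code 210), the UA of 80.9%, PA of 86.4%, and F1-score of 83.6% reported in the table.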

Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
