This study examines the potential of open geodata sets and multitemporal Landsat satellite data as the basis for the automated generation of land use and land cover (LU/LC) information at large scales. In total, six openly available pan-European geodata sets, i.e., CORINE, Natura 2000, Riparian Zones, Urban Atlas, OpenStreetMap, and LUCAS, in combination with about 1500 Landsat-7/8 scenes, were used to generate land use and land cover information for three large-scale focus regions in Europe using the TimeTools processing framework. This fully automated preprocessing chain integrates data acquisition, radiometric, atmospheric, and topographic correction, spectral–temporal feature extraction, and supervised classification based on a random forest classifier. In addition to the six geodata sets and their combinations for automated training data generation, the study evaluates spatial sampling strategies, the inter- and intraclass homogeneity of the training data, and the effects of additional features, such as topography and texture metrics. In particular, the CORINE data set, with up to 70% overall accuracy, showed high potential as a source for deriving dominant LU/LC information with minimal manual effort. The intraclass homogeneity within the training data set was of central relevance for improving the quality of the results. The high potential of the proposed approach was corroborated through a comparison with two similar LU/LC data sets, i.e., GlobeLand30 and the Copernicus High Resolution Layers. While similar accuracy levels were observed for the latter, accuracy for the former was considerably lower, by about 12–24%.
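As an illustration of two notions used above, the following minimal Python sketch computes per-pixel spectral–temporal metrics from a multitemporal reflectance series and the overall accuracy of a classification. It is not the TimeTools implementation: the function names, the choice of statistics (minimum, maximum, mean, median, population standard deviation), and the use of `None` to mark cloud-masked observations are all assumptions made for this example.

```python
from statistics import mean, median, pstdev


def spectral_temporal_metrics(series):
    """Summarise one pixel's time series for a single band.

    `series` holds reflectance values ordered by acquisition date;
    None marks a masked (e.g. cloudy) observation. The statistics
    chosen here are illustrative assumptions, not the paper's feature set.
    """
    valid = [v for v in series if v is not None]
    if not valid:
        return None  # no clear observation for this pixel
    return {
        "min": min(valid),
        "max": max(valid),
        "mean": mean(valid),
        "median": median(valid),
        "std": pstdev(valid),  # population standard deviation
    }


def overall_accuracy(reference, predicted):
    """Share of samples whose predicted class equals the reference class."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have equal length")
    if not reference:
        raise ValueError("at least one sample is required")
    correct = sum(r == p for r, p in zip(reference, predicted))
    return correct / len(reference)


# Hypothetical scaled near-infrared reflectance for one pixel, one masked date.
nir_series = [3000, None, 5000, 4000]
features = spectral_temporal_metrics(nir_series)

# Hypothetical validation labels.
reference = ["forest", "water", "urban", "forest"]
predicted = ["forest", "water", "forest", "forest"]
accuracy = overall_accuracy(reference, predicted)
```

In a full workflow, such metrics would be computed for every band and pixel and then passed, together with training labels derived from the geodata sets, to a random forest classifier.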
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.