Mapping Cropping Practices on a National Scale Using Intra-Annual Landsat Time Series Binning

: Spatially explicit information on cropland use intensity is vital for monitoring land and water resource demands in agricultural systems. Cropping practices underlie substantial spatial and temporal variability, which can be captured through the analysis of image time series. Temporal binning helps to overcome limitations concerning operability and repeatability for mapping large areas and can improve the thematic detail and consistency of maps in agricultural systems. We here assessed the use of annual, quarterly, and eight-day temporal features for mapping ﬁve cropping practices on annual croplands across Turkey. We used 2403 atmospherically corrected and topographically normalized Landsat Collection 1 L1TP images of 2015 to compute quarterly best-pixel composites, quarterly and annual spectral-temporal metrics, as well as gap-ﬁlled eight-day time series of Tasseled Cap components. We tested 22 feature sets for binary cropland mapping, and subsequent discrimination of ﬁve cropping practices: Spring and winter cropping, summer cropping, semi-aquatic cropping, double cropping, and greenhouse cultivation. We evaluated area-adjusted accuracies and compared cropland area estimates at the province-level with ofﬁcial statistics. We achieved overall accuracies above 90%, when using either all quarterly features or the eight-day Tasseled Cap time series, indicating that temporal binning of intra-annual image time-series into multiple temporal features improves representations of cropping practices. Class accuracies of winter and spring, summer, and double cropping were robust, while omission errors for semi-aquatic cropping and greenhouse cultivation were high. Our mapped cropland extent was in good agreement with province-level statistics (r 2 = 0.85, RMSE = 7.2%). Our results indicate that 71.3% ( ± 2.3%) of Turkey’s annual croplands were cultivated during winter and spring, 15.8% ( ± 2.2%) during summer, while 8.5% ( ± 1.6%) were double-cropped, 4% ( ± 1.9%) were cultivated under semi-aquatic conditions, and 0.32% ( ± 0.2%) was greenhouse cultivation. Our study presents an open and readily available framework for detailed cropland mapping over large areas, which bears the potential to inform assessments of land use intensity, as well as land and water resource demands.

Turkey has been, and is, undergoing policy-driven agricultural intensification. In this context, land consolidation efforts increased productivity by reducing parcel fragmentation across the country [42], and the expansion of irrigation infrastructure fostered regional development through increased agricultural production [43]. On the one hand, these policies boosted crop production over decades, ranking Turkey amongst the world's leading producers of cereals and other industrial crops [44]. On the other hand, they also resulted in a diverse agricultural landscape, dispersed along gradients in climate and topography, with a strong inter-annual variability in management practices [45]. Furthermore, low irrigation water use efficiencies [44,45], as well as the limited availability of land and water resources require optimization of cropland management towards a higher resource use efficiency. In this context, spatially explicit data on cropping practices is of high value to understand past and current patterns and drivers of land use intensity and associated resource consumption.
Growing information needs demand for accessible and operational methodologies, and analysis-ready data (ARD), defined as cloud-screened surface reflectance products delivered in regular tiles [46], or higher level time-series products [47]. Such data can enable large area cropland management mapping, independent of costly calibration efforts, which rely on ancillary ground data or large area characterization of vegetation phenology [4]. The Landsat archive is the longest uninterrupted global satellite dataset and thus provides an accessible and consistent data basis for deriving such information. In this study, we compare different approaches for Landsat time-series aggregation. We used consistently produced Landsat Collection 1 data and openly available processing frameworks for generating ARD to promote readily applicable data and methods for large area mapping at medium spatial resolution. The key aim of this paper is to present a set of scientifically sound and openly accessible good practice recommendations for operational national scale mapping of cropland use intensity, demonstrated using the case of mapping cropping practices across Turkey. Specifically, our objectives are to: • Test the performance of Landsat time-series binning methods for mapping cropping practices on annual croplands in Turkey.

•
Investigate the spatial patterns of cropping practices across Turkey. • Compile a set of good practice recommendations for Landsat-based mapping of cropping practices over large areas.

Study Area
Located at the intersection of the European and Asian continent, Turkey's geopolitical role in agricultural production and trade has persisted through centuries. Turkey covers an area of 783,000 km 2 . To date, roughly one-third of the country's land is cultivated [45]. Due to strong biophysical and cultural gradients, Turkey's agricultural lands comprise nine distinct agro-ecosystems that differ in biophysical characteristics, crop types, and cropping practices [45]. In 2015, Turkey's cultivated lands included 15.7 Mha of annual cropland, plus an additional 4.1 Mha of temporary fallow land [48]. Cereals represent nearly two-thirds of the harvested land in Turkey, most notably wheat (35%), barley (12%), and maize (10%) [48]. Additionally, sugar beets (25%), potatoes (7%), sunflower (3%), cotton (3%), and pulses (2%) are grown. As extensive parts of the country have semi-arid to arid climate, irrigation is key for economically viable agricultural production [44]. Rainfed cultivation during spring and the first summer months is possible in most parts of the country. Contrarily, cultivation during the dry summer months requires irrigation for successful crop development in the water-limited regions of Turkey, like Central and South-Eastern Anatolia. Summer cropping in these regions is indicative of irrigation [49]. Irrigation is widespread and currently present on 4.9 Mha, being subject to further expansion and projected to reach 8.5 Mha by 2030 [45]. Seventy-eight percent of the irrigated land is irrigated with surface water, which is mostly applied through surface irrigation (92%), followed by sprinkler (6%) and drip (2%) irrigation [45]. However, national water Remote Sens. 2019, 11, 232 4 of 26 use efficiency was below 50%, causing substantial water losses [45]. Furthermore, up to 55% of the area equipped for irrigation was not irrigated on an annual basis due to mismanagement, water shortages, and institutional barriers at the local level, reflecting underused investments into irrigation infrastructure [44], which can cause strong spatio-temporal dynamics of rainfed and irrigated cropping, and thus to a high variability in land use intensity.

Data Pre-Processing and Class Catalogue
We downloaded 1143 Landsat 7 ETM+ and 1269 Landsat 8 OLI images for the year 2015 that cover our study area of 63 WRS-2 scenes. As we aimed for achieving high radiometric and geometric consistency, we used exclusively Landsat Collection 1 Level 1 Tier 1 precision terrain corrected (C1 L1TP T1) images with a cloud cover of less than 80%. For conversion into topographically normalized surface reflectance, we performed atmospheric correction and topographic normalization using a modified C-correction for topographic normalization to avoid topography-driven misclassification [50,51]. Generation of ARD, i.e., cloud and cloud shadow masking, radiometric correction (atmospheric, topographic, BRDF and adjacency effect correction), and data cubing (reprojection to ETRS89-LAEA projection and tiling into a 30 km × 30 km grid) were performed with the Framework for Operational Radiometric Correction for Environmental Monitoring (FORCE v. 1.1, freely available at http://force. feut.de), based on the algorithm described in Reference [52].
For classification, we included several auxiliary features. We used the 1 Arc-Second (~30 m) digital elevation model acquired by the Shuttle Radar Topography Mission [53] to derive elevation and hillslope. Additionally, we generated per-pixel latitude and longitude in geographic coordinates to account for large-scale variability in land cover characteristics.
We used a hierarchical class definition, first distinguishing annual cropland from other land cover and land use classes. In a next step, we classified annual croplands into five distinct classes of cropping practices, which characterize the annual cropland system of Turkey and relate to its use intensity ( Table 1). The hierarchical approach enables the evaluation of the cropping practice maps independently of the respective cropland mask used. The areal extent of the cropping practice classes was expected to vary strongly. While wheat production in Turkey covers extensive areas, greenhouse cultivation only accounts for a marginal share of the cropland area [45,48]. Accurately capturing small classes in large area mapping poses additional challenges related to training data collection and validation [54]. However, due to our objective to characterize the annual crop production system in a holistic manner, as well as the relevance of classes like semi-aquatic cropping and greenhouse cultivation regarding land and water resource consumption, we decided to integrate these classes in our class catalog. Examples of the spectral-temporal dynamics of our target classes are available in Appendix B.

Generation of Temporal Features
We used all available clear-sky observations (CSO) for temporal binning of all images into gap-filled eight-day, quarterly, and annual temporal features. While previous studies relied on annual temporal binning, we here introduced the quarterly binning as an intermediate trade-off between a high temporal resolution and an observation availability, which is sufficient for producing near gap-free coverage. The gap-filled eight-day temporal window represents an ideal case, where clear observations are available on a near-weekly basis. We derived temporally aggregated features using time-series gap filling, best-observation compositing, as well as the computation of spectral-temporal metrics. All higher-level products were generated using FORCE v.1.1.

Best-Observation Composites
We used 2403 atmospherically corrected and topographically normalized, cloud-screened Landsat Collection 1 L1TP T1 images to compute four best-observation composites [26] across Turkey. To identify the best observation, we conducted a parametric scoring which considered the temporal distance to the target day of the year, as well as the distance to clouds and cloud shadows and a haze score as parameters [27]. We used all available imagery from each quarter of 2015 to produce composites for the target days 15 February, 17 May, 16 August, and 16 November for the blue, green, red, near infrared, and both shortwave infrared bands, summing up to 24 features.
For quality checking and interpretation, we derived compositing flags containing information on the quality, number of CSOs per pixel, the acquisition date of the best observation, the difference between the acquisition date and target day of the year (DOY), as well as the sensor. Additionally, we computed the CSO count for the entire year, considering all areas free of clouds, cloud shadows, snow, and ice, which were not compromised by radiometric saturation.

Spectral-Temporal Metrics
We generated eleven band-wise spectral-temporal metrics for each quarter and the entire year using the CSOs available in the respective period. We computed the minimum, 25 th percentile, median, 75 th percentile, and maximum, mean spectral reflectance as well as the inter-quartile-range, range, and standard deviation of all reflectance values during each period. Additionally, we calculated the skewness and kurtosis of the distribution of reflectance values in the respective period. In total, the eleven spectral-temporal metrics for each quarter and spectral band sum up to 264 quarterly features, plus 66 spectral-temporal metrics from all observations of 2015.

Equidistant Time Series of Tasseled Cap Components
The Tasseled Cap transformation represents a linear transformation of the Landsat spectral bands into components which express selected physical scene characteristics [55]. We applied the Tasseled Cap transformation based on the coefficients for Landsat surface reflectance [56] on the CSOs to generate an equidistant eight-day interval time series of the Tasseled Cap Brightness, Greenness, and Wetness components. We used a data-density weighted ensemble of three Radial Basis Function (RBF) convolution filters for smoothing and gap-filling of the time series [34]. We chose to produce a time series of three Tasseled Cap components over a time series of standard vegetation indices since the Tasseled Cap transformation integrates the entire spectral information content of the six Landsat bands. While the vegetation signal is emphasized by the Tasseled Cap greenness component, the Tasseled Cap wetness, and brightness components potentially aid in the classification of our target classes such as semi-aquatic croplands and greenhouses.
The RBF approach is an ensemble technique, which combines multiple kernels to reduce noise and outliers in a time series while enabling to capture the variability of the signal in managed systems. We defined an ensemble of Gaussian kernels with σ 1 = 8, σ 2 = 16, and σ 3 = 32 days, respectively and applied a kernel-cutoff parameter that restricted the absolute width of the Gaussian kernels to ±21, 41, and 82 days relative to the target date. These kernels are relatively narrow compared to previous studies with a focus on natural ecosystems [34] and provide a trade-off between data gap reduction and the description of dynamic events, such as harvests. We finally aggregated the three kernels by calculating a data-density weighted mean, which gives preference to kernels with a higher data density for the final estimation.
The procedure resulted in a near gap-free eight-day time-series of three Tasseled Cap components for the year 2015, which consisted of 135 features. We further calculated the mean, standard deviation, minimum, and maximum of the Brightness, Greenness, and Wetness time series for the entire year, resulting in 12 additional features.

Training Data & Classification
We collected training polygons with a minimum extent of nine pixels (0.8ha), using the composites, an Enhanced Vegetation Index (EVI) time series for the years 2014-2016 (produced following the methods for the Tasseled Cap time series), and high-resolution imagery available in Google Earth. We collected 1925 training polygons in the different agricultural regions, comprising the Middle North, Aegean, Thrace, Mediterranean, Northeast, Southeast, Black Sea, Middle East, and Middle South regions [45], to capture the heterogeneous management regimes across Turkey. We reduced the number of training samples to 15% of the pixels in each polygon, with a minimum of two and a maximum of 30 pixels, resulting in 10,340 training points (of which 50.2% represented annual croplands).
We compiled 22 different subsets of the 505 input features ( Table 2). We categorized the feature subsets based on the employed temporal binning. The annual scheme refers to annual spectral-temporal metrics, combinations of composites and spectral-temporal metrics from all quarters of the year, all quarterly composites, all quarterly spectral-temporal metrics, or annual statistics derived from the Tasseled Cap time series. The bi-quarterly scheme refers to six combinations of two quarterly composites and spectral-temporal metrics; whereas the quarterly feature sets include composites and spectral-temporal metrics from one quarter only. For the quarterly and annual scheme, we further combined all quarterly feature subsets with annual spectral-temporal metrics. For the weekly scheme, we used the eight-day time series of Brightness, Greenness, and Wetness, as well as the time series together with the annual Tasseled Cap statistics. We used Random Forest classification models [57] for classification of each feature subset using n = 500 trees and included the square root of the number of features at each split. This procedure was used to separately produce binary cropland maps, as well as cropping practice maps for the study region.

Validation Data & Accuracy Assessment
Assuming user's accuracies of 0.85 for all classes and targeting a standard error of the overall accuracy of 1% regarding binary cropland mapping and 2% for the cropping practice mapping, we determined a required sample size of 1275 for the binary and 391 for the cropping practices map, respectively [58]. We performed a stratified allocation of the reference samples due to high imbalances in the extent of specific land uses. We produced a preliminary classification based on all quarterly spectral-temporal metrics and composites, which represented the five cropping practice classes plus nine sub-classes not related to annual croplands. This map provided the strata for implementing a stratified random sampling, allowing us to allocate samples in diverse types of non-cropland areas, such as grasslands, shrublands, or urban environments. We calculated the confidence intervals of accuracies and area estimates for four different sample allocation schemes and determined an allocation scheme with a class-wise minimum of 50 samples, and the remainder was distributed according to class weights derived from the preliminary map as a suitable allocation [54]. The final sample comprised 1447 validation points, 465 of which were annual cropland samples (166 samples for winter cropping: 106 for summer cropping, 63 for semi-aquatic cropping, 67 for double cropping, and 63 for greenhouse cultivation).
We pre-compiled relevant information for each validation point, containing image chips of pixel-based composites displayed in false-color RGB at three different zoom levels, pixel spectra of the point locations, and an EVI time series spanning 2014-2016 (see Appendix B for examples). Based on this phenological information, we determined the class label for each validation sample. We further considered the pixel extent of each sample in Google Earth to verify the label against high-resolution imagery from 2015, or the closest acquisition year available. This was particularly useful to increase the certainty for separating cropland from grassland and for identifying greenhouses. Two trained interpreters crosschecked 25% of the samples, finding interpreter agreement in more than 95% of all cases. Reference points where class labels could not be determined with high confidence were removed. We predicted the reference data points for each classification model and computed area-adjusted overall accuracies as well as area-adjusted class accuracies [54].

Clear Sky Observation Density
We registered an average of 20.42 CSOs per pixel (minimum 0; maximum 62) during the study period. A spatially explicit visualization of CSOs ( Figure 1) revealed data-scarce areas in the Eastern parts of the Black Sea region, while observation density in the across-track overlap areas in South Eastern Anatolia (southeast Turkey, alongside the Syrian border) was the highest. Time windows covering single quarters of the year showed few gaps, except for the first quarter, where 9% of the study area remained unobserved (Table 3).

Clear Sky Observation Density
We registered an average of 20.42 CSOs per pixel (minimum 0; maximum 62) during the study period. A spatially explicit visualization of CSOs ( Figure 1) revealed data-scarce areas in the Eastern parts of the Black Sea region, while observation density in the across-track overlap areas in South Eastern Anatolia (southeast Turkey, alongside the Syrian border) was the highest. Time windows covering single quarters of the year showed few gaps, except for the first quarter, where 9% of the study area remained unobserved (Table 3).

Classification Accuracies
We calculated the area-adjusted overall accuracies as well as their 95% confidence intervals for each binary cropland and cropping practice classification for all 22 input feature sets. The binary classification was robust (Figure 2, black signature) with overall accuracies exceeding 92% throughout, whereas variations in accuracy values were apparent for the cropping practice classification (Figure 2, grey signature).
Generally, features generated from annual data outperformed the seasonally restricted approaches, while the low accuracies of TC_STATS and ANNUAL_STM suggest that retaining seasonal information in temporal aggregates through seasonal binning is essential. We found no differences in overall accuracy when using only composites or only spectral-temporal metrics. Combining both, however, improved the overall accuracy. In the category of bi-quarterly inputs, involving data from the spring and summer months, performed well. Similarly, the single-quarter classifications showed that spectral-temporal metrics and composites from the summer quarter were sufficient for achieving overall accuracies above 85%. Using spectral-temporal metrics and composites from the data-scarce first and fourth quarters, however, yielded the lowest accuracies. Adding feature sets from other seasons generally increased accuracy. Including geographic location (latitude, longitude) and topography (elevation, slope) as auxiliary variables increased the overall accuracies by 1%-3% throughout, indicating their benefit for large area mapping.
approaches, while the low accuracies of TC_STATS and ANNUAL_STM suggest that retaining seasonal information in temporal aggregates through seasonal binning is essential. We found no differences in overall accuracy when using only composites or only spectral-temporal metrics. Combining both, however, improved the overall accuracy. In the category of bi-quarterly inputs, involving data from the spring and summer months, performed well. Similarly, the single-quarter classifications showed that spectral-temporal metrics and composites from the summer quarter were sufficient for achieving overall accuracies above 85%. Using spectral-temporal metrics and composites from the data-scarce first and fourth quarters, however, yielded the lowest accuracies. Adding feature sets from other seasons generally increased accuracy. Including geographic location (latitude, longitude) and topography (elevation, slope) as auxiliary variables increased the overall accuracies by 1%-3% throughout, indicating their benefit for large area mapping.  Four models exceeded 90% overall accuracy in both binary cropland and cropping practice mapping (Tables A1-A4). These were ALL (all features), QRT_STM_CMP (quarterly composites and spectral-temporal metrics), WKL_TC (eight-day time series of three Tasseled Cap components), and WKL_TC_STATS (eight-day time series of three Tasseled Cap components complemented with four annual statistics). All these feature sets retain temporal information throughout the different seasons. Class wise user's and producer's accuracies of these four best models showed robust mapping of spring and winter cropping, summer cropping, and double cropping, whereas semi-aquatic cropping and greenhouse cultivation showed low producer's accuracies (Figure 3).
We based the following analyses on model WKL_TC_STATS, because the cropland extent was closest to official statistics (see Section on Cropland Area Estimates), and the spatial consistency was found to be the highest.
spectral-temporal metrics), WKL_TC (eight-day time series of three Tasseled Cap components), and WKL_TC_STATS (eight-day time series of three Tasseled Cap components complemented with four annual statistics). All these feature sets retain temporal information throughout the different seasons. Class wise user's and producer's accuracies of these four best models showed robust mapping of spring and winter cropping, summer cropping, and double cropping, whereas semi-aquatic cropping and greenhouse cultivation showed low producer's accuracies (Figure 3). We based the following analyses on model WKL_TC_STATS, because the cropland extent was closest to official statistics (see Section on Cropland Area Estimates), and the spatial consistency was found to be the highest.

Evaluating Cropland Maps
Cropland area estimates of the maps with the highest accuracies ranged between 10.73 Mha and 11.72 Mha (±0.8 Mha; Table 4), thereby being substantially lower than reported statistics of cropland area extent, with a country total of 15.7 Mha [48]. We compared province-level (NUTS3) cropland proportions (%) from the binary cropland map WKL_TC_STATS, CORINE Land Cover 2012 [59] as well as the GFSAD30 cropland extent product with the baseline year 2015 [60] with reported statistics of the sown area for the year 2015 [48] (Figure 4). Linear regression revealed high correlation of the cropland extent estimates from WKL_TC_STATS (r² = 0.85) and CORINE 2012 (r² = 0.84), whereas the correlation with estimates from GFSAD30 was comparatively low (r² = 0.64). Regression coefficients revealed the systematic underestimation of cropland area for our cropland mask (slope of regression: β = 1.1), although it was much lower than the overestimation in CORINE 2012 (β = 0.7) and GFSAD30 (β = 0.5). For a more detailed comparison of the province-level estimates, we investigated the accuracy (bias), precision (repeatability) and uncertainty (root mean squared error) of mapped versus reported cropland for all three products. We found lower accuracy compared to CORINE 2012 (i.e., a higher bias), but higher precision (i.e., higher consistency of the prediction) and similar uncertainty (Table 5). Our cropland area estimates had the highest precision of all products investigated here, and their uncertainty was

Evaluating Cropland Maps
Cropland area estimates of the maps with the highest accuracies ranged between 10.73 Mha and 11.72 Mha (±0.8 Mha; Table 4), thereby being substantially lower than reported statistics of cropland area extent, with a country total of 15.7 Mha [48]. We compared province-level (NUTS3) cropland proportions (%) from the binary cropland map WKL_TC_STATS, CORINE Land Cover 2012 [59] as well as the GFSAD30 cropland extent product with the baseline year 2015 [60] with reported statistics of the sown area for the year 2015 [48] (Figure 4). Linear regression revealed high correlation of the cropland extent estimates from WKL_TC_STATS (r 2 = 0.85) and CORINE 2012 (r 2 = 0.84), whereas the correlation with estimates from GFSAD30 was comparatively low (r 2 = 0.64). Regression coefficients revealed the systematic underestimation of cropland area for our cropland mask (slope of regression: β = 1.1), although it was much lower than the overestimation in CORINE 2012 (β = 0.7) and GFSAD30 (β = 0.5). For a more detailed comparison of the province-level estimates, we investigated the accuracy (bias), precision (repeatability) and uncertainty (root mean squared error) of mapped versus reported cropland for all three products. We found lower accuracy compared to CORINE 2012 (i.e., a higher bias), but higher precision (i.e., higher consistency of the prediction) and similar uncertainty (Table 5). Our cropland area estimates had the highest precision of all products investigated here, and their uncertainty was similar to estimates obtained from CORINE 2012. However, the overestimation of cropland in GFSAD30 as well as CORINE 2012, as compared to the national statistics, might partly be related to the product class definitions, which include temporary fallow croplands. uncertainty in thousand hectares.

Model
Accuracy  We overlaid the binary cropland masks resulting from the four best classifications, finding agreement in 93.2% of the area, corresponding to 11.3% of cropland, and 81.9% of the other class. Disagreement across cropland masks (Figure 5Error! Reference source not found., first column) occurred on parcel boundaries, within seasonally inundated wetlands and lakes, and in small-scale, fragmented or low-intensity cropping systems.
A comparison of the map products further demonstrated how these differences unfolded spatially and revealed some advantages of the presented cropping practice map ( Figure 5). For instance, we captured parcels of temporary fallow croplands (e.g., Figure 5A,B,C). Furthermore, our mapping methods are pixel-based and thus not restricted to a minimum mapping unit, which allowed for capturing small parcels in fragmented landscapes (e.g., Figure 5C). Additionally, we found very high agreement between our semi-aquatic cropping class and CORINE's rice field class ( Figure 5 D,E). The thematic detail of our maps allowed for identifying greenhouse cultivation ( Figure  5C) or the distinction between winter, summer, and double cropping in intensively irrigated systems ( Figure 5F).  We overlaid the binary cropland masks resulting from the four best classifications, finding agreement in 93.2% of the area, corresponding to 11.3% of cropland, and 81.9% of the other class. Disagreement across cropland masks ( Figure 5, first column) occurred on parcel boundaries, within seasonally inundated wetlands and lakes, and in small-scale, fragmented or low-intensity cropping systems.
A comparison of the map products further demonstrated how these differences unfolded spatially and revealed some advantages of the presented cropping practice map ( Figure 5). For instance, we captured parcels of temporary fallow croplands (e.g., Figure 5A-C). Furthermore, our mapping methods are pixel-based and thus not restricted to a minimum mapping unit, which allowed for capturing small parcels in fragmented landscapes (e.g., Figure 5C). Additionally, we found very high agreement between our semi-aquatic cropping class and CORINE's rice field class ( Figure 5D,E). The thematic detail of our maps allowed for identifying greenhouse cultivation ( Figure 5C) or the distinction between winter, summer, and double cropping in intensively irrigated systems ( Figure 5F).

Spatial Patterns of Cropping Practices
Error-adjusted class area estimates across the four models were consistent ( Figure 6). According to the selected model (WKL_TC_STATS), spring and winter crops cover 71.3% (±2.3%), and summer crops 15.8% (±2.2%) of the total cropland area. Double cropping occurred on 8.5% (±1.6%), and semiaquatic crops were cultivated on 4.0% (±1.9%) of the cropland area. Greenhouses accounted for only 0.32% (±0.2%) of the cropland area, corresponding to an estimated 36,752 ha, which is well in line with the 38,605 ha reported in official statistics [48]. Figure 6. Error-adjusted area estimates of cropping practice classes (from left to right: winter/spring cropping, summer cropping, double cropping, semi-aquatic cropping, greenhouse cultivation) across the four best classification models (see Table 2 for abbreviations). Error bars indicate 95% confidence intervals.
We visually identified regional differences in cropping practices across Turkey (Figure 7). Semiaquatic cropping, i.e., paddy rice cultivation, was prevalent in Sinop province in the Black Sea region Agglomerations of greenhouses characterized the agricultural landscape along the Mediterranean coastline, e.g., East of Antalya ( Figure 7D,F). We found intensive cultivation patterns in the Şanlıurfa and Mardin provinces of the South-East Anatolia Project irrigation scheme ( Figure 7G). The cropping system in Eastern Anatolia showed heterogeneous parcel sizes and management systems, consisting of winter, summer, and double cropping ( Figure 7H,I).
Province-level (NUTS-3) shares of annual cropland by province showed spatial hotspots of cropland in the Northwest, Central Anatolia, and South-Eastern Anatolia (Figure 8). Fractions of cropping practices in 2015 revealed the dominance of spring and winter cropping across Turkey. Fifty-three provinces were dominated (>60%) by winter and spring cropping. Summer cropping was the second most common cultivation strategy and distributed across almost all provinces, with spatial clusters along the Aegean Sea and the Eastern Mediterranean Sea. Double cropping occurred on less than 9% of croplands and clustered spatially in South-East Anatolia, e.g., Mardin (40%) and Şanlıurfa (21%), in the river valleys of the Aegean Sea region, e.g., İzmir (37%), and parts of the Black Sea region where cropland was limited. Semi-aquatic cropping was relevant in Edirne (21%) along the Greek border and Sinop (21%) in the Central Black Sea region. Greenhouse cultivation clustered along the Mediterranean coastline, especially Antalya (17%) and İçel (10%).

Figure 6.
Error-adjusted area estimates of cropping practice classes (from left to right: winter/spring cropping, summer cropping, double cropping, semi-aquatic cropping, greenhouse cultivation) across the four best classification models (see Table 2 for abbreviations). Error bars indicate 95% confidence intervals.
We visually identified regional differences in cropping practices across Turkey (Figure 7). Semi-aquatic cropping, i.e., paddy rice cultivation, was prevalent in Sinop province in the Black Sea region ( Figure 7A) and the Evros river valley alongside the Turkish-Greek border. Here, misclassifications of summer crops on the unflooded parcel-boundaries were apparent ( Figure 7B). Distinct patterns of summer and double cropping occurred in the river valleys flowing into the Aegean Sea, most notably in the provinces of Aydın, Maniza, andİzmir ( Figure 7C). Central parts of Anatolia contained large-scale cropping systems ( Figure 7E). Agglomerations of greenhouses characterized the agricultural landscape along the Mediterranean coastline, e.g., East of Antalya ( Figure 7D,F). We found intensive cultivation patterns in theŞanlıurfa and Mardin provinces of the South-East Anatolia Project irrigation scheme ( Figure 7G). The cropping system in Eastern Anatolia showed heterogeneous parcel sizes and management systems, consisting of winter, summer, and double cropping ( Figure 7H,I).
Province-level (NUTS-3) shares of annual cropland by province showed spatial hotspots of cropland in the Northwest, Central Anatolia, and South-Eastern Anatolia (Figure 8). Fractions of cropping practices in 2015 revealed the dominance of spring and winter cropping across Turkey. Fifty-three provinces were dominated (>60%) by winter and spring cropping. Summer cropping was the second most common cultivation strategy and distributed across almost all provinces, with spatial clusters along the Aegean Sea and the Eastern Mediterranean Sea. Double cropping occurred on less than 9% of croplands and clustered spatially in South-East Anatolia, e.g., Mardin (40%) andŞanlıurfa (21%), in the river valleys of the Aegean Sea region, e.g.,İzmir (37%), and parts of the Black Sea region where cropland was limited. Semi-aquatic cropping was relevant in Edirne (21%) along the Greek border and Sinop (21%) in the Central Black Sea region. Greenhouse cultivation clustered along the Mediterranean coastline, especially Antalya (17%) andİçel (10%).

Discussion
The openly accessible Landsat archive provides unseen opportunities for informing land use intensity assessments over large areas. The presented study generated a set of good practice recommendations for mapping cropping practices over large areas, considerations for transferring the presented approaches into eras with different sensor constellations, uncertainties related to our approach, as well as insights concerning Turkey's cropland management regimes.

Good Practice Recommendations
We highlight that the binning of dense image time series within pre-defined temporal windows is an efficient technique to produce widely consistent image features over space and time. Overall, the robustness, high classification accuracies and spatial consistency of our maps demonstrate the applicability of currently operational sensor constellations for mapping cropping practices across environmental and management gradients. We were able to identify some patterns when testing several temporal binning schemes and methods, which translate into a set of good practice recommendations.
First, retaining the seasonal information contained in the intra-annual time series is of the utmost importance for mapping cropping practices. Our tests revealed that annual binning into spectraltemporal metrics or time series statistics performed poorly, as the phenological information is lost (Figure 2). In line with a recently presented approach on crop type mapping [61], increasing the temporal resolution of the included features had a positive impact on classification accuracies. We thus suggest narrow, e.g., near-weekly binning wherever data availability, allows for satisfactory spatial coverage during cloud-prone seasons. Alternatively, quarterly binning provided a good tradeoff between temporal detail and observation availability that allowed for producing robust results on national, up to continental scales [25].

Discussion
The openly accessible Landsat archive provides unseen opportunities for informing land use intensity assessments over large areas. The presented study generated a set of good practice recommendations for mapping cropping practices over large areas, considerations for transferring the presented approaches into eras with different sensor constellations, uncertainties related to our approach, as well as insights concerning Turkey's cropland management regimes.

Good Practice Recommendations
We highlight that the binning of dense image time series within pre-defined temporal windows is an efficient technique to produce widely consistent image features over space and time. Overall, the robustness, high classification accuracies and spatial consistency of our maps demonstrate the applicability of currently operational sensor constellations for mapping cropping practices across environmental and management gradients. We were able to identify some patterns when testing several temporal binning schemes and methods, which translate into a set of good practice recommendations.
First, retaining the seasonal information contained in the intra-annual time series is of the utmost importance for mapping cropping practices. Our tests revealed that annual binning into spectral-temporal metrics or time series statistics performed poorly, as the phenological information is lost (Figure 2). In line with a recently presented approach on crop type mapping [61], increasing the temporal resolution of the included features had a positive impact on classification accuracies. We thus suggest narrow, e.g., near-weekly binning wherever data availability, allows for satisfactory spatial coverage during cloud-prone seasons. Alternatively, quarterly binning provided a good trade-off between temporal detail and observation availability that allowed for producing robust results on national, up to continental scales [25].
Second, summer composites and spectral-temporal metrics produced satisfactory overall accuracies for Turkey. Thus, images from June to September essentially describe the spectral-temporal characteristics of our target classes. However, other seasonal windows might be more important when applying the approach in regions with different climate or management regimes. Similar to a study focusing on continental scale cropland mapping [25], classification accuracies increased when features from all quarters were included. Doing so presents an alternative to data-driven identification of the phenologically relevant season for accurate class discrimination or expert knowledge acquisition, as the first is challenging to implement and the latter is commonly not available in a spatially explicit manner across large areas.
Third, including auxiliary features describing geographic location and topography improved mapping accuracy. In line with other studies [62], we thus recommend the integration of such features in large area mapping as a common practice.
Our mapping efforts were relatively inexpensive. Two trained interpreters conducted the collection of training data and labeling of validation points in 320 work hours. Overall computation times were below two weeks for pre-processing, computation of all temporal features, and classification. However, we performed all computations using multi-core processing, which reduced processing times substantially. Storage space requirements were highest for pre-processed Landsat data (2.6 TB), composites and spectral-temporal metrics (0.8 TB), as well as Tasseled Cap time series and statistics (0.4 TB), but can be further reduced via image compression. In cases where processing and storage infrastructure is not available, processing can alternatively be performed using readily available ARD on cloud processing platforms [63]. The map product presented in this paper is available online (https://doi.pangaea.de/10.1594/PANGAEA.897547).

Temporal Transferability
The applicability of individual methods is dependent on the operational sensor constellation during the period of interest. For past decades, highly irregular inter-annual acquisition densities in the Landsat archive pose challenges for long-term mapping approaches [28]. Seasonally restricted spectral-temporal metrics are promising tools for mapping land use across large areas in past decades [31]. Average clear-sky observation density across Turkey between 1984 and 2017 is rarely below three observations during summer, suggesting that efforts targeting the mapping of cropping practices in Turkey over past decades are potentially worthwhile. However, these demand analyses of the quality of spectral-temporal features under scenarios of scarce and temporally disperse observations.
On the contrary, mapping cropping practices in the years past 2015 will benefit from novel sensor constellations with higher revisit frequency, which increases the probability of cloud-free acquisitions. Improvements of sensor integration [64,65] and automated image processing [66] now enable the combined use of intra-annual Landsat 8 and Sentinel 2A+B acquisitions for agricultural mapping applications [61]. Increased observation density will likely improve the mapping of cropland management interventions, such as harvests, or flooding of semi-aquatic crops. Recent advances also suggest the applicability of radar data to further supplement mapping efforts at high thematic detail [67], which can also be temporally aggregated for large area applications [68].

Uncertainties and Limitations
The products generated here meet common quality criteria regarding classification accuracy, and the high spatial consistency and thematic detail of these maps have the potential to satisfy the growing need for land use intensity datasets covering large areas [1,2]. However, remaining uncertainties relate to underestimated cropland extent and the omission of semi-aquatic cropping. We underestimated cropland extent by 25%, mostly due to confusion between spring crops and grasslands, as both classes represent herbaceous vegetation layers underlying precipitation-driven phenology and management interventions. Differences in cropland area estimates can be reduced by employing adaptive thresholds for classification probability, in order to match higher quality cropland area estimates, such as province-level statistics [18]. The omission of semi-aquatic cropping occurred due to the confusion with summer cropping, owing to strong phenological similarities. This insight supports the hypothesis that image availability during the flooding stage is crucial for accurate representation of semi-aquatic crops, such as paddy rice [69].
While our study aims at the characterization of annual croplands, spatially explicit information on grasslands and perennial croplands is essential for a better characterization of agricultural land use intensity [1]. In 2015, permanent pastures accounted for 14.6 Mha and perennial croplands for 3.3 Mha in Turkey [48], highlighting the need for detailed characterization of perennial crop systems to improve the understanding of the entire production system. However, the mapping of perennial croplands and grasslands poses various challenges from a remote sensing perspective [70][71][72]. These arise from heterogeneous class characteristics (crop type, grass species, phenology, planting density, irrigation, and mowing frequency) and the lack of datasets that allow for discriminating natural from managed grasslands, as well as cropped plantations from timber plantations.

Cropland Intensity and Water Resources in Turkey
Cropland management strongly determines agricultural water requirements [73,74]. We show that vast areas of annual croplands exist in semi-arid parts of Turkey, where total annual precipitation ranges between 200 mm and 600 mm, with near-zero precipitation during summer and particularly high evaporation rates [45]. While we show that spring and winter cropping, covering extensive areas of winter cereals with relatively low water requirements, dominate in Turkey [45,75], our results reveal that 28% of the national cropland area was cultivated during summer in 2015 (including summer, double, and semi-aquatic cropping). The recent expansion of irrigated agriculture, coupled with high irrigation water requirements led to a 48% increase in agricultural water requirements between 1992 and 2008 [76]. In the case of the South-East Anatolia Project, the expansion of irrigated cotton cultivation successfully boosted the regional economy [77], yet this occurred at the cost of drastic water consumption increases [49,78].
Current irrigation water use in Turkey is excessive and inefficient [44]. Generally, low water prices due to a lack of volumetric pricing schemes foster excessive water use [79]. Irrigation efficiencies in Turkey commonly do not exceed 50%, mostly due to the prevalence of surface irrigation, and high conveyance losses [80]. While the Turkish government envisions a 73% expansion of irrigated agriculture until 2030 [81], doing so without drastic institutional, agronomic, and engineering changes poses a threat to the national water resource base. Climate change further aggravates crop water deficits across Turkey [82], causing yield declines on rainfed and irrigated croplands [83,84]. The adoption of more efficient irrigation techniques offers water savings potential in the region [85], whereas saved water can be re-allocated and enable irrigation in other production systems [81,86]. Improving water use efficiency through reduced conveyance and evaporative losses is thus imperative to foster more sustainable production and should be of the utmost priority of planning authorities to prevent water scarcity and maintain environmental flow requirements in future intensification pathways [87].

Conclusions
We based our approach exclusively on openly available datasets (Landsat Collection 1 imagery) and algorithms (FORCE v1.1) that allow for streamlining pre-processing chains and producing analysis-ready datasets. We thereby demonstrated the benefits of openly accessible satellite image datasets and pre-processing frameworks for improving our understanding of agricultural land use intensity across large areas. This study provides a robust mapping framework to disentangle annual cropland management, using Turkey as a case study.
The fine spatial and temporal detail of the resulting maps allow for tracking cropping practices at 30 m spatial resolution and at annual intervals. Our maps can reveal areas of low cropland use intensity, such as rainfed cultivation on small parcels or fallow croplands, as opposed to intensity hotspots, such as double-cropped areas, flood-irrigated croplands, or areas with greenhouse cultivation. This is a substantial improvement compared to statistics at aggregated units, or other map products of reduced thematic detail, or coarser spatial or temporal resolution. The detail of our maps has the potential to inform stakeholder-related processes and assessments of land and water resource consumption over large areas.
We formulated good practice recommendations for binning Landsat time-series into temporal features for wall-to-wall characterization of cropping practices. We recommend binning into quarterly or finer temporal windows, wherever data availability is sufficient. We achieved the best classification results by using quarterly composites and spectral-temporal metrics, or gap-filled time series of Tasseled Cap components as input features. Auxiliary features on topography and geographic location improved classification accuracies throughout. Knowledge of regional phenological characteristics of the target classes is essential. However, we generally recommend the integration of features from multiple seasons, when difficulties in identifying key phenological windows prevail.
The generic temporal binning of all available clear sky observations into eight-day or quarterly time windows was a successful strategy in a Mediterranean setting but is likely applicable in climatic zones with a good data availability in the growing season. In this light, transferring our approach to other regions and eras requires investigating the effect of observation density on the consistency of temporal features, to determine minimum data requirements and subsequent trade-offs in mapping accuracies. The presented methods are potentially applicable across yearly time series to get insights into the inter-annual cropland management dynamics and to derive efficiency indicators such as irrigation ratios. Complementing the presented methods with techniques for detecting parcel sizes (e.g., References [17,88]) could enable the long-term evaluation of governmental land consolidation policies. Such information could ultimately improve our understanding of long-term developments of cropland management, and thus land and water resource demand at the regional to continental scale.  Table A1. Confusion matrix for variable subset "ALL". Cells populated with adjusted probabilities [54]. WC: winter cropping, SC: summer cropping, AC: semi-aquatic cropping, DC: double-cropping, GH: greenhouse cultivation.