Combining Fractional Cover Images with One-Class Classifiers Enables Near Real-Time Monitoring of Fallows in the Northern Grains Region of Australia

Abstract: Fallows are widespread in dryland cropping systems. However, timely information about their spatial extent and location remains scarce. To overcome this lack of information, we propose to classify fractional cover data from Sentinel-2 with biased support vector machines. Fractional cover images describe the land surface in intuitive, biophysical terms, which reduces the spectral variability within the fallow class. Biased support vector machines are a type of one-class classifier that requires labelled data for the class of interest and unlabelled data for the other classes. They allow us to extrapolate in-situ observations collected during flowering to the rest of the growing season to generate large training data sets, thereby reducing the data collection requirements. We tested this approach to monitor fallows in the northern grains region of Australia and showed that the seasonal fallow extent can be mapped with > 92% accuracy during both the summer and winter seasons. The summer fallow extent can be accurately mapped as early as mid-December (1–4 months before harvest). The winter fallow extent can be accurately mapped from mid-August (2–4 months before harvest). Our method also detected emergence dates successfully, which indicates that it is accurate in near real-time. We estimated that the extent of fallow fields across the northern grains region of Australia ranged between 50% in winter 2017 and 85% in winter 2019. Our method is scalable, sensor independent and economical to run. As such, it lays the foundations for reconstructing and monitoring the cropping dynamics in Australia.

have focused on mapping fallows (i) as part of end-of-season crop type maps [10][11][12]; (ii) by elimination from the active cropland [4]; (iii) by analysing normalised time series of vegetation indices [7,13] and spectral matching techniques [14,15]; or (iv) by evaluating a pixel's greenness status in both space and time [16]. These approaches rely heavily on vegetation indices that are sensitive to soil colour, and they require large amounts of training data, including for classes that are not of direct interest. In addition, they often require seasonal or yearly time series, which impedes in-season assessments and increases computational costs. As a result, near real-time monitoring of fallows with remote sensing is still lacking in many regions of the world.
There are two main challenges associated with mapping fallows at scale: the spectral heterogeneity and the training data requirements. First, spectral signatures of fallows are highly variable due to shifting environmental conditions, management practices and combinations thereof. Environmental conditions range from soil type and colour to soil moisture. Management practices include, for instance, windrowing, stubble retention, cultivation, no-till, and weed control. As a result of the wide range of existing fallow practices, fallows are often misclassified as active cropland depending on climatic and soil conditions [13], cropping techniques [17], crop failures [3], or as natural ecosystems due to similar regeneration trajectories. Second, supervised classification algorithms require widespread and timely field data for training, but such field data are often lacking. Accuracy depends, amongst other things, on the characteristics of the training data. It positively correlates with sample size [18][19][20], even though there is evidence that, in small regions, small, carefully-selected data sets yield similar accuracy to larger sets [21][22][23]. Accuracy is also affected by the presence of outliers [20,24,25] and imbalance among classes [19,20,25,26]. In the case of a fallow monitoring system, training a conventional classifier would imply collecting labelled data for the fallow class as well as for the crop classes across development stages, for a variety of plant densities, and across a range of environmental and management conditions. At scale, such data would be impractical and costly to collect. One way to reduce training data requirements is to use one-class classifiers because these classifiers solely need labelled data for the class of interest [27][28][29][30]. Here, we propose a novel method to monitor fallows that addresses the class heterogeneity problem and that has parsimonious data collection requirements.
In this paper, we demonstrate that fallows can be monitored in near real-time across the northern grains region of Australia by combining one-class classifiers with fractional cover data. This combination of data and classifier is particularly interesting because it describes fallows in intuitive biophysical terms (sub-pixel proportions of soil and photosynthetic and non-photosynthetic vegetation) and alleviates the need for training data from the non-fallow (negative) class. Simultaneously, it increases scalability and reduces sensor dependency. We tested this method during the 2017 winter and 2018 summer seasons and found that the seasonal fallow extent can be mapped accurately up to four months prior to harvest. We also found that our method was able to detect crop emergence, i.e., the transition from fallow to crop. Finally, we ensembled the optimal classifiers to reduce computing time and showcased, for six consecutive seasons, how our method can monitor cropland dynamics.

Study Area
This study focused on an area of circa 250,000 km² in the northern grains region of Australia (northern New South Wales and southeastern Queensland; Figure 1). Soil types vary from grey, brown, or black vertosols to red or brown sodosols and red or brown chromosols [31], resulting in strong variations of soil colour (Figure 2). Availability of water is the main factor limiting crop productivity across the region. Most cropping is located between the 500 mm and 650 mm average annual rainfall isohyets. Annual rainfall is summer-dominant and extremely variable, especially in the north and west, with the result that cropping systems have increasingly evolved around opportunistic cropping rather than fixed rotations [32]. Planting for the summer season starts as early as September and harvests can last until late June, while the winter season spans from April to December. For cereal crops grown for grain, double-cropping is not routinely practised due to the risk of crop failure as a result of water stress [33]. Previous studies in the area of interest estimated the cropping intensity at 0.29 to 1.33 crops per year, with an average of 0.94 [34]. Cropping intensity can be as low as 58% in the western and northern areas where rainfall is less reliable [32].
Figure 1. Location of the area of interest and of the reference data available for training and calibration. Grey areas represent the cropping areas as mapped in ACLUM. Insets A and B correspond to natural-colour Sentinel-2 images acquired on 2017/09/07. Inset A shows an area in the Goondiwindi region, where irrigated cotton is dominant (farm dams can be seen in white); inset B displays a dryland system in the Toowoomba region, where strip cropping is commonplace.
Fallows can be characterised according to their duration (short or long), weed control (chemical or mechanical) and stubble management (stubble retention rate, stubble grazing, windrowing, or burning). During long fallows, fields are taken out of production for 12 to 18 months during which weeds are managed so that soil water accumulates. Short fallows last for a shorter period, typically a growing season. During summer fallows (a 5-7-month period between the harvest of last season's winter crop and the sowing of this season's wheat crop), rainfall events in excess of 20-30 mm can infiltrate below the evaporative zone and be stored for subsequent crop growth [35]. Fallows of 12-18 months are commonly used to transition from summer to winter crops [32]. Cultivated fallows involve repeated use of tillage implements whereas chemical fallows involve repeated use of non-residual herbicides to control weeds. The advantage of chemical fallows in years of low rainfall and yield potential was shown to result from greater water conservation (53 mm year⁻¹ on average [36]). Efficiency of water capture and use by crops has further been improved by conservation tillage as well as no-till and crop residue retention, which have been widely adopted across Australia [37]. The range of stubble management options, e.g., windrowing, can lead to a wide diversity of spectral signatures. All these factors contribute to generating a wide diversity of fallow types across the study area (Figure 2).
We capitalised on an existing algorithm developed under Australian conditions to convert surface reflectance into cover fractions [42][43][44]. This algorithm uses linear unmixing to quantify the photosynthetic vegetation (PV), the non-photosynthetic vegetation (NPV), and the remaining fraction of bare soil (BS) in each pixel. The unmixing model was initially calibrated with Landsat data based on 1500 in-situ observations covering a wide variety of vegetation, soil and climate types across Australia [42,43]. All in-situ observations were recorded between July 2002 and January 2013 and were concentrated during the southern-hemisphere autumn, winter and spring. The model uses six spectral bands (blue, green, red, near infrared, and two shortwave infrared bands). It imposes a hard non-negativity constraint and a soft sum-to-one constraint, i.e., the endmember abundances are encouraged, but not forced, to sum to one (although they generally do). This initial unmixing model retrieved cover fractions with a Root Mean Squared Error (RMSE) of 11%. Following a cross-sensor calibration procedure, it was recently adapted to Sentinel-2 images to provide a 10-m product, which necessitated resampling all 20-m bands to 10 m using bicubic convolution [44]. With these reflectance adjustments, the coefficients of determination ($R^2$) between Landsat and Sentinel-2 bands were ≥ 0.98 and the RMSEs were ≤ 0.017. Consequently, cover fractions retrieved from Landsat and Sentinel-2 strongly agree ($R^2$ ≥ 0.95; RMSEs ≤ 0.057). The output of the retrieval process is a 10-m four-band image: the first three bands correspond to the cover fractions (PV-NPV-BS; see Figure 3) and the last band to the model fitting error, which was discarded in further analyses.
We sourced detailed information about the cropland extent in Queensland and New South Wales from the Australian Land Use and Management (ALUM) classification (www.agriculture.gov.au/abares/aclump/land-use/data-download). Land use information at catchment scale has been produced for all states in Australia, with dates and scales varying depending on when the data were collected and the intensity of land use. ALUM provides a three-tiered hierarchical structure, with primary, secondary and tertiary classes structured by their potential degree of modification and their impact on an initial land cover [45]. The land use features were manually digitised based on visual interpretation of multi-temporal satellite imagery, digital orthophotos, scanned aerial photography as well as ancillary data sets containing land use information and field observations. The reliability scale is 1:10,000 for New South Wales and 1:50,000 for Queensland. At a scale of 1:50,000, the area of the smallest mapped object is two hectares and the minimum width for linear objects is 50 m. The "cropping" class defined the extent of the cropland in our area of interest. It includes categories such as cereals, hay and silage, oilseeds, cotton, or pulses. The cropland extent might be overestimated because areas of pasture that appeared to be harvested for fodder or grazed were often mapped as "cropping". Overall, ALUM provides an accurate, complete and consistent description of land uses and of the cropland extent across our area of interest.

Reference Data
We collected reference data during two roadside surveys in September 2017 (winter season) and January 2018 (summer season). Roadside data collection has been shown to be a viable sampling approach for training data collection provided that environmental and management gradients are surveyed [46]. For fallow mapping in the northern grains region of Australia, these gradients involve changes in soil colour, soil moisture, stubble retention and tillage. Previous research concluded that changes in soil colour or soil moisture did not significantly influence model performance [43]. Sample selection biases due to the survey method were unlikely because fields were observed from a range of roads, from primary roads to dirt tracks. Field boundaries associated with each point-based label were subsequently drawn using Google Earth and in-season Sentinel-2 imagery. A total of 2009 ground-truth polygons were available, of which 1133 were collected during summer and 876 during winter (Figure 1). The average field size was 50 ha (5000 pixels).
A second data set consisting of 44 geolocated data points was collected by private agronomists who visited these fields weekly and recorded emergence when half the plants along a 1-m transect in the sowing row reached emergence.

Methods
Our objective was to develop a method to map fallows in near real-time across the northern grains region of Australia. To increase the temporal coverage of field data and reduce fieldwork, we assumed, based on domain knowledge, that fallow labels could be extended across the first half of the growing season. Indeed, the cover fractions of fallows are likely to remain stable over time because, in the northern grains zone of Australia, good management practices involve spraying weeds soon after they emerge. As a result, fallow labels collected during flowering, i.e., when the fallow extent is minimal, can be extended to the first half of the season, thereby extending the temporal coverage of positive data at no cost. By contrast, crop labels cannot be extended because their emergence date is unknown. At best, they can be considered unlabelled, which prevents the use of conventional classifiers and justifies the one-class approach.
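The labelling rule described above can be sketched in a few lines of Python. This is only an illustration of the principle, not the authors' implementation: the function name, the label strings and the choice to treat a crop field as "negative" on the survey date itself are assumptions made for the example.

```python
from datetime import date


def label_observation(field_class, obs_date, season_start, survey_date):
    """Return the training label of one field-date observation.

    field_class is the class recorded at the survey ("fallow" or "crop").
    Fallow labels are extended back to the start of the season, whereas
    crop fields are treated as unlabelled before the survey because their
    emergence date is unknown.
    """
    if not (season_start <= obs_date <= survey_date):
        return None  # outside the extraction period
    if field_class == "fallow":
        return "positive"
    if field_class == "crop":
        # Assumption: the crop label is only trusted on the survey date.
        return "negative" if obs_date == survey_date else "unlabelled"
    raise ValueError(f"unknown class: {field_class}")
```

For example, a field recorded as fallow in early September would yield positive samples for every cloud-free image since the start of the winter extraction period, while a cropped field would only contribute unlabelled samples over the same dates.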
Based on these principles, our classification method has four main steps (

Data Pre-Processing And Resampling
We extracted positive, negative, and unlabelled data from the field observations. The extraction period spanned from 6 April to 2 September for the winter season and from 10 October to 20 January for the summer season. To provide reliable estimates of the space-time generalisation accuracy, the surveyed fields were split into two independent sets of polygons (50:50): one set was used for training classifiers and the other for validation. We randomly selected 15 pixels per field-date for both the positive and unlabelled data to reduce redundancy in the data.
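A minimal sketch of the field-level split and the per-field-date subsampling is given below. It assumes that pixels are grouped by a (field, date) key; the function names, the data layout and the fixed random seeds are hypothetical and chosen only to make the example reproducible.

```python
import random


def split_fields(field_ids, seed=0):
    """Split field identifiers 50:50 into training and validation sets."""
    ids = sorted(field_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return set(ids[:half]), set(ids[half:])


def sample_pixels(pixels_by_field_date, n=15, seed=0):
    """Keep at most n randomly chosen pixels for each (field, date) key."""
    rng = random.Random(seed)
    return {
        key: rng.sample(pixels, min(n, len(pixels)))
        for key, pixels in pixels_by_field_date.items()
    }
```

Splitting at the field level rather than the pixel level keeps all pixels of a given field on the same side of the split, which is what makes the two sets independent.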
Outliers can significantly impact the performance of one-class classifiers [24,25]. As positive data contained outliers due to imperfect cloud and shadow masking, we removed them using nonparametric iterative trimming [47]. Iterative trimming identifies outliers in the data by estimating the probability density function of the data distribution, then removes all data points with a probability density estimate smaller than α. Thus, α controls how much data are removed in each iteration. This procedure was repeated until no new outliers were identified. Kernel density estimates were obtained using binned kernel approximations [48], which are suitable when a large number of observations are at hand. We tested five values of α and selected the optimal one by cross-validation (Table 1).
Given the sheer number of positive and unlabelled pixels, we generated 25 smaller, more manageable training sets of 10,000 pixels (5000 pixels per class) using bootstrapping (iterative sampling with replacement). Positive data were randomly drawn. Unlabelled data were selected following a stratified sampling that maximised the spatial coverage of the feature space. A total of 3228 strata were defined by tessellation of the feature space in regular triangles. Each side of the triangular strata had a length of two per cent fractional cover. We then randomly selected 5000 samples with a maximum of two samples per stratum. Good balance between the positive and unlabelled classes has been shown to improve the performance of one-class classifiers [25].
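The trimming loop can be illustrated with the sketch below. It is a simplified stand-in for the procedure described above: it uses a naive Gaussian kernel density estimate with a fixed bandwidth instead of binned kernel approximations, works on two cover fractions only, and the bandwidth and α values used with it are illustrative assumptions rather than the values in Table 1.

```python
import math


def gaussian_kde(points, bandwidth):
    """Naive Gaussian kernel density estimate at every point of a 2-D sample."""
    n = len(points)
    norm = n * 2.0 * math.pi * bandwidth ** 2
    densities = []
    for (x, y) in points:
        total = sum(
            math.exp(-((x - u) ** 2 + (y - v) ** 2) / (2.0 * bandwidth ** 2))
            for (u, v) in points
        )
        densities.append(total / norm)
    return densities


def iterative_trimming(points, alpha, bandwidth):
    """Drop points whose density is below alpha until no new outlier is found."""
    kept = list(points)
    while kept:
        densities = gaussian_kde(kept, bandwidth)
        survivors = [p for p, d in zip(kept, densities) if d >= alpha]
        if len(survivors) == len(kept):
            break  # no new outliers: stop iterating
        kept = survivors
    return kept
```

Because the density is re-estimated after each removal, a point that was sheltered by nearby outliers in one iteration can itself be trimmed in the next, which is why the procedure is repeated until the sample stabilises.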

Classification Method
One-class classifiers are well-established in remote sensing. They can be divided into two groups: P-classifiers and PU-classifiers. P-classifiers use only positive data for training, whereas PU-classifiers use both positive and unlabelled data. P-classifiers can lead to unreliable results when the training sample is not representative or when the positive data are not clearly distinct from the other class. Therefore, we implemented biased Support Vector Machines (bSVM [49]), a PU-classifier adapted from the binary Support Vector Machine [50]. Empirical evidence showed that bSVM performs better than other one-class classification methods [51].
The bSVM classifier transforms the original training set into a high-dimensional space and constructs a hyperplane that maximises the margin between the classes. The margin is the distance from the hyperplane to the closest element on either side. It also includes a regularisation parameter C, which controls the trade-off between maximising the margin and minimising the training error. The core concept of bSVM is to penalise classification errors differently depending on whether the misclassified data were labelled or not [49]. As the unlabelled set is likely to contain a significant amount of positive data, it is relevant to penalise misclassifications of positive data more strongly than those of unlabelled data. Thus, misclassification cost terms are defined for each class so that $C_{+}$ and $C_{-}$ are the regularisation terms for positive and unlabelled data, respectively. Rather than setting $C_{+}$ directly, our implementation used $C_{x}$, a multiplier that defines the penalty of the positive class from the regularisation term of the unlabelled class, i.e., $C_{+} = C_{x} \times C_{-}$.
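For reference, the class-specific penalties can be written in the usual soft-margin form. The expression below is the standard biased SVM objective from the positive-unlabelled learning literature, given here as a reading aid; the slack variables $\xi_i$, the feature map $\phi$, and the index sets $P$ (positive samples, $y_i = +1$) and $U$ (unlabelled samples, $y_i = -1$) are notation introduced for this illustration.

```latex
\min_{w,\,b,\,\xi}\;\; \tfrac{1}{2}\lVert w \rVert^{2}
  \;+\; C_{+} \sum_{i \in P} \xi_{i}
  \;+\; C_{-} \sum_{i \in U} \xi_{i}
\quad \text{subject to} \quad
y_{i}\bigl(w^{\top}\phi(x_{i}) + b\bigr) \ge 1 - \xi_{i},
\qquad \xi_{i} \ge 0,
\qquad \text{with } C_{+} = C_{x} \times C_{-}.
```

Setting $C_{x} > 1$ makes an error on a labelled fallow pixel more costly than an error on an unlabelled pixel, which reflects the fact that some unlabelled pixels are fallows themselves.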
We selected Gaussian radial basis functions as kernel for the bSVM models in order to create non-linear classifiers. The inverse kernel width (σ) as well as $C_{-}$ and $C_{x}$ were optimised using grid search (Table 1).
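The two ingredients tuned by the grid search can be made concrete with the short sketch below. It assumes the kernel parameterisation in which σ is an inverse width, i.e., k(x, z) = exp(−σ‖x − z‖²), and it evaluates the class-weighted hinge cost for a given set of decision scores. The function names are hypothetical and the code does not train a model.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian RBF kernel with inverse width sigma: exp(-sigma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sigma * sq_dist)


def biased_hinge_cost(scores, labels, c_minus, c_x):
    """Sum of hinge losses weighted by class-specific penalties.

    labels are +1 (positive) or -1 (unlabelled). Positive samples are
    penalised with C+ = c_x * c_minus, unlabelled ones with c_minus.
    """
    c_plus = c_x * c_minus
    cost = 0.0
    for score, label in zip(scores, labels):
        hinge = max(0.0, 1.0 - label * score)
        cost += (c_plus if label == 1 else c_minus) * hinge
    return cost
```

With a multiplier of 2, a positive sample and an unlabelled sample that violate the margin by the same amount contribute to the cost in a 2:1 ratio.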

Model Selection
Models are generally selected based on accuracy measures derived from the error matrix (Figure 5). In the context of binary or multi-class classification, an array of accuracy metrics is available [52]. The overall accuracy (OA) is the proportion of pixels that were correctly classified:
$$\mathrm{OA} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (1)$$
where $TP$, $TN$, $FP$ and $FN$ are the numbers of true positives, true negatives, false positives and false negatives. The true positive rate (TPR), also known as the producer's accuracy, is the number of correctly detected positive cases (true positives) over the number of actual positive cases in the data:
$$\mathrm{TPR} = \frac{TP}{TP + FN} \qquad (2)$$
The precision, or the user's accuracy, indicates the probability of correctly detecting a positive case:
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (3)$$
The F-score is the harmonic mean of precision and true positive rate [53]:
$$F = \frac{2 \times \mathrm{Precision} \times \mathrm{TPR}}{\mathrm{Precision} + \mathrm{TPR}} \qquad (4)$$
Precision is not available for one-class problems because only positive samples are known. This poses serious challenges because, if only the true positive rate can be computed, a classifier would naturally tend to classify all data as positive. Thus, specific accuracy metrics have been proposed to capture the specifics of one-class classification. A reliable one-class classifier minimises the number of unlabelled data classified as positive and maximises the amount of positive data that are correctly classified [54]. An oft-used metric is the probability of positive prediction (PPP), which estimates the ratio of pixels classified as positive:
$$\mathrm{PPP} = \frac{TP + FP}{N + P} \qquad (5)$$
where the total number of observations is given by $N + P$. A second common metric is the PU-performance (puF), which is related to the F-score and combines TPR and PPP [55]:
$$\mathrm{puF} = \frac{\mathrm{TPR}^{2}}{\mathrm{PPP}} \qquad (6)$$
It follows that the higher the puF, the more accurate the one-class classifier.
Evidence showed that modifying the decision threshold is an inexpensive optimisation that can lead to significant improvement of performance measures [30,51,56]. The bSVM model produces continuous predictions and a threshold is needed to separate the positive and negative classes. By default, this threshold is 0. Piiroinen et al. [54] proposed an approach (herein the min.dist approach) to select the optimal threshold based on the Receiver Operating Characteristic (ROC) curve adapted for one-class problems and on the concept of Pareto dominance. In this adaptation of the ROC curve, TPR is on the y-axis and PPP on the x-axis. The rationale of the min.dist approach is that the most accurate classifiers have a high TPR and a low PPP, i.e., they should be located as close as possible to the upper left corner of the ROC curve plot (TPR = 1 and PPP = 0). The classifier with the optimal threshold is thus the closest to the top-left corner in the TPR-PPP space.
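The metrics and the threshold search can be sketched as follows. The code is a simplified illustration: PPP is computed over all observations (labelled positives plus unlabelled data), the min.dist selection keeps only the distance criterion and omits the Pareto-dominance step, and all function names are hypothetical.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Overall accuracy, true positive rate, precision and F-score."""
    oa = (tp + tn) / (tp + fp + fn + tn)
    tpr = tp / (tp + fn)
    precision = tp / (tp + fp)
    f_score = 2 * precision * tpr / (precision + tpr)
    return oa, tpr, precision, f_score


def pu_metrics(pos_scores, all_scores, threshold=0.0):
    """TPR, PPP, puF and distance to the corner (PPP = 0, TPR = 1)."""
    tpr = sum(s > threshold for s in pos_scores) / len(pos_scores)
    ppp = sum(s > threshold for s in all_scores) / len(all_scores)
    pu_f = tpr ** 2 / ppp if ppp > 0 else 0.0
    dist = math.hypot(ppp, 1.0 - tpr)
    return tpr, ppp, pu_f, dist


def best_threshold_min_dist(pos_scores, all_scores, thresholds):
    """Pick the threshold whose (PPP, TPR) point is closest to the top-left corner."""
    return min(thresholds, key=lambda t: pu_metrics(pos_scores, all_scores, t)[3])
```

In this sketch, `pos_scores` holds the continuous classifier outputs for the labelled fallow pixels and `all_scores` those of every pixel in the evaluation set; calling `pu_metrics` with the default threshold of 0 corresponds to the puF selection.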
We compared the puF and min.dist models. We tested 50 different thresholds for every combination of parameters to generate the min.dist candidate models. In total, we generated 5 million bSVM models (50 thresholds × 5 α values × 10 $C_{-}$ values × 8 $C_{x}$ values × 10 σ values × 25 repetitions). We then assessed the accuracy of the models selected by the two approaches for seven periods of 20 days ranging from 4 April to 2 September and 10 October to 20 January for the winter and summer seasons, respectively. Within each 20-day period, we only considered the predictions of the latest cloud-free observations for each pixel. We selected the better approach based on overall accuracy and F-score. We assessed the statistical significance of the differences between the two approaches by means of t-tests.

Validating Seasonal Fallow Predictions
As for model calibration, ground truth data distributed along the growing season were lacking. Therefore, we validated the outputs of the bSVM models in two steps: first, we validated maps of the seasonal fallow extent and assessed how soon it could be reliably mapped; and second, we evaluated the accuracy of the dynamic fallow detection during the season using crop emergence data.
The accuracy of the seasonal fallow extent was evaluated using an independent set of ground truth data for both the summer and winter seasons. Positive data were derived from bootstrapped subsets of the fallow polygons (579 polygons on average) and negative data were randomly selected from crop observations (405 polygons; Figure 4). Ground truth labels were then compared against those provided by the classifiers for images within a 20-day window around the field survey date so as to fill in gaps due to missing values. To assess performance at both the overall and class levels, we computed the overall accuracy (OA; Equation (1)), the true positive rate (TPR; Equation (2)), the precision (Equation (3)), and the F-score (Equation (4)) across the 25 bootstrapped samples. We compared the accuracy between every mapping period and survey date period using t-tests. Statistical significance was declared if differences in accuracy were > 5%. Note that a decline in accuracy does not automatically imply poor model performance but rather indicates that some late-sown crops may not have been sufficiently developed to be remotely distinguished from fallows.

Building an Ensemble
Ensuring a low computational cost is essential for near real-time monitoring applications that need to run at scale. Thus, for inference, we combined the selected models (trained from the 25 training sets) into an ensemble. To that purpose, we applied the 25 bSVM models to every combination of cover fraction regularly spaced by 1% cover, which is the resolution of the retrieval algorithm. The class of each grid point was then determined by majority voting. The ensemble was a k-nearest neighbour (k-NN) algorithm with k set to 1. We benchmarked the CPU time of a single bSVM model against that of the k-NN ensemble. We reported the average CPU time of each approach to process 100 randomly-selected and cloud-free Sentinel-2 images and assessed their difference with a Wilcoxon signed rank test.
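The idea behind the ensemble can be sketched as a lookup table: the base models are evaluated once on a regular grid of cover fractions, the majority vote is stored, and new pixels are classified by snapping them to the nearest grid point (which is what a 1-NN search amounts to on a regular grid). In the sketch below, each base model is represented by a hypothetical Python callable and the `step` argument is exposed so that a coarse grid can be used for testing; neither is part of the original implementation.

```python
from collections import Counter
from itertools import product


def build_lookup(models, step=1):
    """Tabulate the majority vote of the base models on a regular grid.

    Each model is a callable mapping (pv, npv, bs), in per cent, to 0 or 1.
    """
    levels = range(0, 101, step)
    table = {}
    for point in product(levels, repeat=3):
        votes = Counter(model(point) for model in models)
        table[point] = votes.most_common(1)[0][0]
    return table


def classify(table, pixel, step=1):
    """1-nearest-neighbour lookup: snap each fraction to its closest grid level."""
    top = 100 // step * step  # highest level present in the grid
    key = tuple(
        min(top, max(0, int(round(v / step)) * step))
        for v in pixel
    )
    return table[key]
```

The cost of evaluating the base models is paid once when the table is built; classifying an image then reduces to one dictionary lookup per pixel, regardless of the number of models in the ensemble.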
Our implementation ran in R version 3.5.0 and relied on the raster, caret, kernlab and oneClass packages. All computations took place on a Dell PowerEdge M630 system where each classification was allocated ten Dual 10 core Intel Xeon E5-2660 V3 processors running at 2.6 GHz with 25 MB cache and 10 GB of RAM.

Validating Near Real-Time Performances
We assessed the accuracy of the dynamic fallow detection by comparing reported and detected crop emergence dates at 71 locations across the study area for the 2017 winter season. For each location, the detected emergence date was defined as the first date of the longest series of consecutive non-fallow detections. The most accurate model was selected based on the results of the first validation. The reported emergence date was corrected to account for the delay between emergence and the first actual satellite overpass. We evaluated the agreement between observed and detected crop emergence dates with the $R^2$ and RMSE.
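The emergence rule defined above can be written as a short run-length search over a pixel's classification time series. The sketch below assumes chronologically ordered inputs with no missing values; the function name is hypothetical, and ties between equally long runs are resolved in favour of the earliest one, which is a choice made for this example.

```python
def detect_emergence(dates, is_fallow):
    """Return the first date of the longest run of non-fallow detections.

    dates and is_fallow are parallel, chronologically ordered sequences.
    None is returned when the pixel is classified as fallow throughout.
    """
    best_start, best_len = None, 0
    run_start, run_len = None, 0
    for i, fallow in enumerate(is_fallow):
        if fallow:
            run_start, run_len = None, 0  # the current run is interrupted
            continue
        if run_len == 0:
            run_start = i
        run_len += 1
        if run_len > best_len:
            best_start, best_len = run_start, run_len
    return None if best_start is None else dates[best_start]
```

Using the longest run rather than the first non-fallow detection makes the rule robust to isolated misclassifications early in the season.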

Mapping the Seasonal Fallow Dynamics
To map the fallow dynamics for six sequential seasons from 2017 to 2019, we classified all images between January and February, and between September and October. Then, we summarised all classifications in the period of interest using a mode algorithm to reduce the impact of missing values. We also calculated the fallowed and cropped areas in the period of interest and charted the flows between the fallow and cropped areas from summer 2017 to winter 2019.
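A per-pixel version of the mode compositing can be sketched as follows, with missing values (for instance cloud-masked observations) represented by `None`. The function names and the encoding of fallow as 1 and crop as 0 are assumptions made for this example.

```python
from collections import Counter


def seasonal_mode(labels):
    """Most frequent class in a pixel's time series, ignoring missing values."""
    valid = [label for label in labels if label is not None]
    if not valid:
        return None  # no cloud-free observation in the period
    return Counter(valid).most_common(1)[0][0]


def fallow_share(season_map):
    """Fraction of classified pixels labelled fallow (1) in a seasonal map."""
    valid = [label for label in season_map if label is not None]
    return sum(valid) / len(valid) if valid else float("nan")
```

Applying `seasonal_mode` to every pixel yields one seasonal map per period, from which the fallow share and the season-to-season transitions can be tabulated.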

Model Selection: puF vs. min.dist
For each of the 25 bootstrapped samples, we trained 200,000 models and selected the best ones following the puF and min.dist criteria. The best models had the following parameters: σ = 10, $C_{-}$ = 5.1, $C_{x}$ = 2, α = 0.01 for puF and σ = 10, $C_{-}$ = 9.1, $C_{x}$ = 2, α = 0.01 and threshold = 0.938 for min.dist. To compare the best puF and min.dist models, we calculated the average overall accuracy and F-score based on the closest cloud-free observations to the survey dates (Table 2). Both methods achieved high overall accuracy (0.86-0.94) and F-score (0.88-0.94). Tuning the threshold of the decision boundary (min.dist) did not provide a significant increase in accuracy compared to the default decision threshold (puF). Therefore, we retained the bSVM models with the highest puF because they were less computationally demanding to train.

Accuracy of Seasonal Fallow Maps and Crop Emergence Detection
We applied the best puF models to fractional cover images for six 20-day periods preceding the survey to identify how early reliable maps of the fallow extent could be obtained. We then computed the corresponding overall accuracy, F-score, true positive rate, and precision (Figure 6). In winter 2017, the overall accuracy, the precision and the F-score reached about 92% in early August and continued to grow up to 94% in early September. It is important to stress here that the majority of "errors" are not actual errors; rather, they reflect the limitations of extending crop labels to dates prior to crop emergence. The true positive rate did not drop below 99%, indicating that fallows were rarely missed. Accuracy differences were not significant from mid-August on (Wilcoxon tests; p-values > 0.05). Therefore, we concluded that the winter fallow extent could be determined as early as mid-August (or 2-4 months before harvest).
With the exception of a slight decrease in late January, similar trends were observed in the summer of 2018. The overall accuracy, the F-score and the precision reached about 90% in late December and decreased to 85% in late January. The true positive rate remained high (99%) throughout the season. The drop in accuracy at the end of January is likely related to the larger sowing windows for summer crops than for winter crops. For instance, sorghum can be sown from early September to late February. Harvests of early-sown sorghum can occur while late-sown sorghum is still maturing. Differences in accuracy were significant until mid-December. Therefore, we concluded that the summer fallow extent could be mapped from mid-December onwards (or 1-4 months before harvest).
Figure 6. Accuracy of the seasonal fallow extent predictions over time. NS indicates non-significant differences while *, ** and *** indicate significant differences at the 95%, 99% and 99.9% confidence levels.
The median run time for a single bSVM for processing a cloud-free Sentinel-2 image was 2.7 h (i.e., 67.5 h are necessary for all 25 models of the ensemble to complete). In comparison, the k-NN ensemble ran in 2.5 h, which was a significant reduction in run time (p-value = 0.0450). The ensemble successfully integrated the 25 bSVM models for less than the computational cost of a single bSVM model.
To validate the near real-time detection performance of the ensemble, we compared predicted crop emergence dates against reported ones (Figure 7). The regression had an $R^2$ of 0.57 and an RMSE of 4.8 days, which underlines the model's ability to detect land-use changes in near real-time.

Mapping the Seasonal Fallow Extent and the Cropping Dynamics
We mapped the seasonal fallow extent for six consecutive seasons from summer 2017 to winter 2019 using the ensemble. Then, we calculated the flow of pixels between the fallow and the cropping pools (Figure 8). Estimates suggested that 20% of the cropland area had a cropping intensity of 2, 30% of 1.5, 23% of 1, 11% of 0.5, and 16% of 0. Pixels with a cropping intensity of 0 were largely found in the western fringe of the cropping area where rainfall is less reliable. The seasonal fallow extent ranged between 50% in winter 2017 (3.3 million ha) and 85% in winter 2019 (5.6 million ha), which underlines the importance of fallowing in the study area. Due to dry conditions across the study area, the cropped area has kept decreasing since the record-high winter-crop production achieved in 2016-2017 [57]. The large extent of winter fallows in 2018 and 2019 is consistent with official reports [57,58]. It is related to a lack of sowing opportunities or failed crop establishment because of drought conditions across south-east Queensland and northern New South Wales.
The fallow extent maps captured spatial and temporal patterns in agreement with domain knowledge (Figure 9a). For instance, most fallows were located on the western fringe of the cropping area, which is, on average, drier (500-550 mm of annual rainfall) than the eastern part of the study area (550-750 mm). Fields near dams were largely fallowed before being sown with irrigated cotton crop, which is more profitable (Figure 9b inset A). The average bare soil fraction, photosynthetic fraction and non-photosynthetic fraction were 55%, 10%, and 35%, respectively.

Discussion
In this paper, we introduced a new method to estimate the location and extent of fallows across the northern grains region of Australia based on one-class classifiers and fractional cover data. Our method can use single fractional cover images derived from Sentinel-2 at any point in time without re-calibration. The final classifier is an ensemble of 25 bSVM base models with low computing costs and is, therefore, well-suited for repeated, large-scale applications.
The level of accuracy of our method seems to indicate that it is fit for several purposes. First, accurate estimates of the seasonal fallow extent could be obtained by mid-August and mid-December for the winter and summer seasons, respectively (1-4 months before harvest). They can inform early-season estimates of the area cropped, well ahead of the optimal window for crop identification. For instance, the best temporal window for single-date classification of summer crops in south-eastern Australia was February to mid-March [59]. Such timely information would be particularly valuable during drought-stricken years. Second, near real-time information about the extent and location of fallows combined with pixel-level photosynthetic vegetation cover fractions can provide insights about the weed populations that need to be controlled, and is therefore a proxy for herbicide demand. In the region of interest, herbicides represent one of the largest costs to grain growers [60]; not controlling weeds results in a direct loss of yield potential. Third, the model's ability to detect crop emergence seems promising. Recent attempts to identify sowing dates have leveraged the hyper-temporal and hyper-spatial resolutions of CubeSats (e.g., [61]) but these can hardly be scaled due to the cost of such data. While more work is needed to deal with one-off false negatives (false detection of crops), deriving crop emergence from time series of fallow maps is an interesting alternative to most time series approaches, which cannot provide timely estimates of crop emergence. Overall, our approach seems fit to serve a range of applications in the study area. We expect that it would perform equally well in other regions of the Australian wheat belt as well as in other cropping systems where rainfall is the main factor limiting crop production.
One of the benefits of our method is that it reduces the data collection effort to a single roadside survey during the flowering period, which is similar to the surveys used for cropland or crop type mapping [46,62]. We then extended the labels observed around flowering back in time (beyond crop emergence) based on assumptions supported by the knowledge of local fallow management practices. The net effect was a more comprehensive class description, accompanied by a dramatic increase in the number of pixels available for training. We reduced the sample size by randomly sampling 5000 pixels per class. Each bSVM model was trained with data from around 550 fields labelled as fallows and 400 unlabelled crop fields. If the latter are not available, they can be obtained by sampling randomly within the cropland area. The training set size might have been further reduced, with minimal impact on the classification accuracy, by using more advanced data selection methods [21][22][23]. The ability to lower data collection requirements is advantageous for large-scale applications.
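The random reduction to a fixed number of pixels per class can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `sample_per_class`, the class names, the synthetic pixel pools and the fixed seed are hypothetical, and the study's own sampling code is not reproduced here.

```python
import random
from typing import Dict, List, Sequence, Tuple

# One pixel = (photosynthetic vegetation, non-photosynthetic vegetation, bare soil).
Pixel = Tuple[float, float, float]


def sample_per_class(pixels_by_class: Dict[str, Sequence[Pixel]],
                     n_per_class: int = 5000,
                     seed: int = 0) -> Dict[str, List[Pixel]]:
    """Draw up to `n_per_class` pixels per class without replacement."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    sampled = {}
    for label, pixels in pixels_by_class.items():
        k = min(n_per_class, len(pixels))  # small classes are kept whole
        sampled[label] = rng.sample(list(pixels), k)
    return sampled


# Synthetic pools standing in for pixels extracted from labelled fallow
# fields and from unlabelled fields (values are arbitrary).
pools = {
    "fallow": [(i / 20000, 0.3, 0.7 - i / 20000) for i in range(20000)],
    "unlabelled": [(0.5, i / 3000, 0.5 - i / 3000) for i in range(3000)],
}

training_set = sample_per_class(pools, n_per_class=5000, seed=42)
```

Drawing a new sample with a different seed for each base model would be one way to diversify an ensemble, although whether the study proceeded this way is not stated in this section.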
We reduced sensor dependence by choosing cover fractions instead of spectral data. This means that our model can be ported to other sensors or can integrate data from multiple sensors to improve coverage in cloud-prone areas. In fact, the unmixing model used was first developed from Landsat. As a result, our approach can make use of the whole archive of Landsat fractional cover products to highlight changes in the fallow patterns or to spatially reconstruct the trend towards early sowing [63]. Both in hindcast and nowcast, our model enables several interesting cross-platform, national-scale applications. It could readily be deployed on the Australian Geoscience Data Cube [64], which provides timely processing of big earth observation data through its integration within the high-performance computing environment provided by the Australian National Computational Infrastructure.
The fundamental goal of this work was to develop a method for operational fallow monitoring as well as data for future research on cropping dynamics in Australia. This study advanced the capabilities for near real-time monitoring of agriculture in Australia with low data requirements. The general lack of field data over space and time, which makes training and validation difficult, also speaks to the unique value brought by one-class classification methods. This study also suggests several options for further improvement. Foremost among these is to reduce the uncertainty of the unmixing method (which retrieves cover fractions with an average accuracy of 11.6%) with multi-output machine learning regression models (see [65] for a review). Such improvements will propagate and directly boost the accuracy of fallow detection when crop cover is low. Second, accuracy could be further improved by integrating classification outputs (and possibly domain knowledge) across time so as to reduce the occurrence of one-off detections. Third, fallow mapping will benefit from improved cropland masking. It is interesting to note that our method could be used to refine cropland maps. Applied over multiple years, it could identify areas consistently not cropped and areas with permanent vegetation cover, and thus progressively improve the cropland map. Fourth, cover fractions could be used to establish a biophysical typology of fallows based, for instance, on crop residues. This would pave the way to studying issues such as crop residues and their impacts on soil erosion and soil organic carbon [66] or mapping the shift to agricultural systems involving no-tillage across Australia [67]. Finally, future work is needed to evaluate whether harvest dates could be detected.
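One simple way to integrate classification outputs across time, given here only as a hedged illustration of the second option above, is a sliding majority filter applied to each pixel's sequence of labels. The function name `smooth_labels`, the window length of three observations and the tie-handling rule are assumptions of this sketch and were not evaluated in this study.

```python
from typing import List, Sequence


def smooth_labels(labels: Sequence[bool], window: int = 3) -> List[bool]:
    """Replace each label by the majority label in a centred window.

    `labels` is a chronological sequence of per-date fallow flags for one
    pixel. At the edges the window is truncated; if a truncated window is
    tied, the original label is kept.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighbourhood = labels[max(0, i - half):i + half + 1]
        fallow_votes = sum(neighbourhood)
        if fallow_votes * 2 == len(neighbourhood):
            smoothed.append(labels[i])  # tie: keep the original label
        else:
            smoothed.append(fallow_votes * 2 > len(neighbourhood))
    return smoothed


# True = fallow. The third observation is an isolated "crop" detection.
raw = [True, True, False, True, True, False, False, False]
filtered = smooth_labels(raw, window=3)
```

Because a centred window needs the observation that follows each date, the most recent label can only be confirmed once the next image is available, which is a cost to weigh against near real-time delivery.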

Conclusions
We developed a new method to monitor fallow dynamics in near real-time. Our method combines one-class classifiers (biased support vector machines) with fractional cover images derived from Sentinel-2. One-class classifiers allowed us, based on domain knowledge, to extend the temporal coverage of in-situ observations collected around flowering in order to generate large training data sets, thereby reducing the data collection requirements. Fractional cover images intuitively describe fallows in biophysical terms and increase sensor independence. In the northern grains region of Australia, the seasonal fallow extent was identified with an average accuracy >92% in both summer and winter. We also showed that reliable maps of the seasonal fallow extent could be obtained as early as mid-August for the winter season (2-4 months before harvest) and mid-December for the summer season (1-4 months before harvest). The method's ability to identify fallows in near real-time was verified using crop emergence data (R² = 0.57). The final classifier was a simple ensemble of 25 one-class classifiers, which significantly reduced computational costs so that inference across large areas is fast. Our method can readily be transferred to other sensors and implemented on the Australian Geoscience Data Cube. This paper provides a clear pathway towards operational monitoring of the fallow and cropland dynamics across Australia.