Detecting Winter Cover Crops and Crop Residues in the Midwest US Using Machine Learning Classiﬁcation of Thermal and Optical Imagery

: Cover crops are an increasingly popular practice to improve agroecosystem resilience to climate change, pests, and other stressors. Despite their importance for climate mitigation and soil health, there remains an urgent need for methods that link winter cover crops with regional-scale climate mitigation and adaptation potential. Remote sensing is ideally suited to provide these linkages, yet, cover cropping has not been analyzed extensively in remote sensing research. Methods used for remote sensing of crops from satellites traditionally leverage the difference between visible and near-infrared reﬂectance to isolate the signal of photosynthetically active vegetation. However, using traditional greenness indices like the Normalized Difference Vegetation Index (NDVI) for remotely sensing winter vegetation, such as winter cover crops, is challenging because vegetation reﬂectance signals are often confounded with reﬂectance of bare soil and crop residues. Here, we present new and established methods of detecting winter cover crops using remote sensing observations. We ﬁnd that remote sensing methods that incorporate thermal data in addition to traditional reﬂectance metrics are best able to distinguish between winter farm management practices. We conclude by addressing the potential of existing and upcoming hyperspectral and thermal missions to further assess agroecosystem function in the context of global change. ﬁeld validation data from fall windshield collected A team of surveyors presence presence,


Introduction
As one of the most intensively managed landscapes worldwide, agroecosystems provide a crucial opportunity to mitigate greenhouse gas emissions at a global scale [1]. One of the most promising climate mitigation and adaptation interventions in agroecosystems is the practice of cover cropping [2], where farmers plant a non-commercial crop during fallow periods when the soil is typically bare between cash crop harvest and the following season's planting. Cover cropping, alongside zero tillage and mulching, represent a set of practices that are jointly referred to as conservation agriculture, which is practiced by farmers worldwide [3]. In temperate climates, such as the U.S. Midwest (also known as 'the Corn Belt'), cover cropping occurs during winter when growing conditions for vegetation is minimized. Winter cover crops are crucial to climate mitigation because of their potential to dramatically increase soil carbon sequestration [2]. In addition, they offer important climate adaptation benefits to help farmers adapt to prolonged drought and more intense precipitation events [4,5]. Even though cover crops' mitigation and adaptation benefits are touted as a win-win opportunity (e.g., [6][7][8]), research to detect and analyze the effects of winter cover crops at large spatial and temporal scales remains in the early stages.
Large-scale analysis of cover crops requires improving upon current multispectral remote sensing techniques to extend our existing understanding of cover crops gained from field-scale research. Cover crops have not been analyzed extensively in remote sensing  Prabhakara et al. (2015) compared relationships between 10 different visible range and near infrared (NIR) indices and percent ground cover and biomass of winter cover crops [10]. They found that NDVI and the triangular vegetation index (TVI), which is the hypothetical triangle created by green peak reflectance, red minimum chlorophyll absorption bands, and the NIR edge, provided the most accurate estimates of cover crop biomass and percent cover ( Table 1). One of the difficulties in relying solely on traditional greenness indices for detecting cover crops is that they are intermixed in agricultural fields alongside bare soils and crop residues, which are the dormant and decaying remains of harvested crops left in the field. Traditional greenness indices have difficulty distinguishing between bare soils and crop residues (Figure 1, [11]). Furthermore, cover crops also become dormant with cold winter temperatures or can be seeding after dormancy, emerging only in the spring [18]. Recently, time series methods to identify cover crops based on phenology from multiple sensors have successfully predicted cover crop and cover crop residue cover over large spatial scales [19] and cover crop end of season [15].
Using wavelengths beyond those in the visible and NIR portions of the electromagnetic spectrum is a promising strategy to differentiate between non-photosynthetically active vegetation (i.e., crop residues) and bare soils in agroecosystems. By incorporating information from wavelengths in the shortwave infrared (SWIR), cellulose and lignin absorption features of crop residues can be detected and used to differentiate them from bare soils ( [11,[20][21][22]; Table 1). Hively et al. (2018) compared the performance of the normalized difference tillage index (NDTI) index from Landsat observations and six WorldView-3 SWIR indices to map the spatial pattern of crop residues [23]. Their results  Prabhakara et al. (2015) compared relationships between 10 different visible range and near infrared (NIR) indices and percent ground cover and biomass of winter cover crops [10]. They found that NDVI and the triangular vegetation index (TVI), which is the hypothetical triangle created by green peak reflectance, red minimum chlorophyll absorption bands, and the NIR edge, provided the most accurate estimates of cover crop biomass and percent cover ( Table 1). One of the difficulties in relying solely on traditional greenness indices for detecting cover crops is that they are intermixed in agricultural fields alongside bare soils and crop residues, which are the dormant and decaying remains of harvested crops left in the field. Traditional greenness indices have difficulty distinguishing between bare soils and crop residues (Figure 1, [11]). Furthermore, cover crops also become dormant with cold winter temperatures or can be seeding after dormancy, emerging only in the spring [18]. Recently, time series methods to identify cover crops based on phenology from multiple sensors have successfully predicted cover crop and cover crop residue cover over large spatial scales [19] and cover crop end of season [15].
Using wavelengths beyond those in the visible and NIR portions of the electromagnetic spectrum is a promising strategy to differentiate between non-photosynthetically active vegetation (i.e., crop residues) and bare soils in agroecosystems. By incorporating information from wavelengths in the shortwave infrared (SWIR), cellulose and lignin absorption features of crop residues can be detected and used to differentiate them from bare soils ( [11,[20][21][22]; Table 1). Hively et al. (2018) compared the performance of the normalized difference tillage index (NDTI) index from Landsat observations and six WorldView-3 SWIR indices to map the spatial pattern of crop residues [23]. Their results suggested that two SWIR indices of Lignin Cellulose Absorption Index (LCA) and Shortwave Infrared Normalized Difference Residue Index (SINDRI) performed most accurately (Table 1). Similarly, Sonmez and Slater (2016) found that SWIR-based Cellulose Absorption Index detected crop residue and tillage practices better than three vegetation indices [21]. Rather than estimating percent of ground cover or biomass of cover crops, machine learning methods can predict the presence of cover crops anywhere on the landscape using only remotely sensed data. Seifert et al. 2018 used a random forest classification to incorporate the NDVI and SWIR bands along with additional parameters, such as elapsed days, to classify winter cover crops [24]. We note that the term "Growing Degree Days" generally incorporate a temperature threshold calculation, and henceforth refer to GDD as defined by Seifert et al. (2018) as "Elapsed Days". These promising efforts indicate that remote sensing can be an effective approach to analyze cover cropping and related conservation agriculture practices at regional scales. However, most studies have still largely limited their approaches to smaller geographic areas, and few have taken advantage of the potential of machine learning models to explicitly predict cover crop presence in agroecosystems. Additionally, applications of thermal remote sensing in agriculture have been limited compared to those using optical remote sensing data [28]. Thermal remote sensing provides information about temperatures on the earth's surface, which is linked to landscape-level patterns and processes like vegetation cover [29], plant activity [30], and vegetation water stress [31,32]. While traditionally challenging to process at fine scales, recent developments including the Landsat Provisional Surface Temperature (ST) product [33], provide a key opportunity to test the potential benefits of thermal data for prediction of agroecosystem practices. A preliminary exploration of the differences in ST between winter cover crops, crop residues, and bare soils indicates consistent temperature differences in the fall that could be leveraged by machine learning classification algorithms ( Figure 2).
Our ability to monitor cover crop adoption and predict the benefits of widespread cover crop use as a potential climate mitigation strategy depends on our ability to accurately detect cover crops across regional scales. In this paper we evaluate the predictive capacity of established and novel remote-sensing-based metrics for detecting cover crops using a series of machine learning models. The goal of this work is to assess which regions of the electromagnetic spectrum and indices are most effective for detecting cover crops and crop residues. We developed two suites of machine learning models: a two-class model to predict the presence or absence of winter cover crops, and a three-class model to classify winter cover crops, crop residues, and bare soils. The model inputs increase in complexity from traditional greenness indices, to adding SWIR and then thermal data with the Landsat Remote Sens. 2021, 13,1998 4 of 17 ST product. We anticipate that the addition of thermal data will improve detection of cover crops because cover crops are generally warmer than bare soil in the autumn ( Figure 2). We also anticipate that adding SWIR will improve model ability to distinguish between cover crops and crop residues. Our overarching goal is to develop a comprehensive methodology to quantify observed and potential benefits of winter cover crop adoption at spatial and temporal scales relevant for land management. Our ability to monitor cover crop adoption and predict the benefits of widespread cover crop use as a potential climate mitigation strategy depends on our ability to accurately detect cover crops across regional scales. In this paper we evaluate the predictive capacity of established and novel remote-sensing-based metrics for detecting cover crops using a series of machine learning models. The goal of this work is to assess which regions of the electromagnetic spectrum and indices are most effective for detecting cover crops and crop residues. We developed two suites of machine learning models: a two-class model to predict the presence or absence of winter cover crops, and a three-class model to classify winter cover crops, crop residues, and bare soils. The model inputs increase in complexity from traditional greenness indices, to adding SWIR and then thermal data with the Landsat ST product. We anticipate that the addition of thermal data will improve detection of cover crops because cover crops are generally warmer than bare soil in the autumn ( Figure 2). We also anticipate that adding SWIR will improve model ability to distinguish between cover crops and crop residues. Our overarching goal is to develop a comprehensive methodology to quantify observed and potential benefits of winter cover crop adoption at spatial and temporal scales relevant for land management.

Materials and Methods
We obtained Landsat Operational Land Imager (OLI) and TIRS (Thermal Infrared Sensor) observations and the Landsat Provisional ST product for the 2015-2016 growing season and calculated vegetation-relevant metrics in the visible, near-infrared, shortwave infrared, and thermal regions. Using these remotely sensed input data and ground-truth information from windshield surveys (Section 2.1), we created a series of random forest models to classify (1) the presence/absence of cover crops; and (2) distinguish between cover crops, crop residues, and bare soil/conventional tillage. We then built models with increasing complexity to assess the additional value of each wavelength region beyond simple greenness (NDVI). Remote sensing classification models were assessed using both model accuracy and Cohen's Kappa.

Materials and Methods
We obtained Landsat Operational Land Imager (OLI) and TIRS (Thermal Infrared Sensor) observations and the Landsat Provisional ST product for the 2015-2016 growing season and calculated vegetation-relevant metrics in the visible, near-infrared, shortwave infrared, and thermal regions. Using these remotely sensed input data and ground-truth information from windshield surveys (Section 2.1), we created a series of random forest models to classify (1) the presence/absence of cover crops; and (2) distinguish between cover crops, crop residues, and bare soil/conventional tillage. We then built models with increasing complexity to assess the additional value of each wavelength region beyond simple greenness (NDVI). Remote sensing classification models were assessed using both model accuracy and Cohen's Kappa.

Survey Data
Indiana is in the Midwestern United States, a temperate region that produces about one-quarter of the world's annual soybean supply and one-third of the world's annual maize supply [34]. There are over 127 million acres of agricultural land in the Midwest, 75% of which is used for corn and soybeans, and the remaining 25% is used to produce a variety of crops including alfalfa, tobacco, and wheat [35]. The most popular cover crops in Indiana are fall-seeded cereals like rye and wheat [36]. Cash crops are typically planted in mid-April to early May, and harvested in mid-September to mid-October.
Our study draws on field validation data from fall windshield surveys collected by the U.S. Department of Agriculture's Natural Resources Conservation Service (NRCS) across five counties in Indiana, USA. A team of surveyors drives along prescribed transects and manually records cover crop presence and species, crop residue presence, and prior cash Remote Sens. 2021, 13, 1998 5 of 17 crop, among other metrics [37]. We obtained the survey transect data from five selected counties in Indiana (Figure 3), resulting in data from a total of 1262 ground-truthing fields. We aggregated the species-specific cover crop observations into presence/absence data. The counties differed in their rates of cover cropping, but the number of transect points documenting the presence of cover crops within each county were generally representative of that county's cover cropping rate. (Table 2). Survey transect data were processed in ESRI ArcGIS version 10.1. Specifically, 120 m buffers (polygons) were created around all 1262 GPS transect points, avoiding roads, ditches, and other non-agricultural land cover areas. We chose 120 m to keep the polygons as close to the roadside points as possible and to maintain proximity to the transect points, while permitting enough pixels to represent the ground. The average spectral signature for each band within a 40 m buffer around the 120 m buffer centroid was used to calculate band metrics in Table 2. These data were then used to train our random forest classification models.
in mid-April to early May, and harvested in mid-September to mid-October.
Our study draws on field validation data from fall windshield surveys collected b the U.S. Department of Agriculture's Natural Resources Conservation Service (NRCS across five counties in Indiana, USA. A team of surveyors drives along prescribed tran sects and manually records cover crop presence and species, crop residue presence, an prior cash crop, among other metrics [37]. We obtained the survey transect data from fiv selected counties in Indiana (Figure 3), resulting in data from a total of 1262 ground-truth ing fields. We aggregated the species-specific cover crop observations into presence/ab sence data. The counties differed in their rates of cover cropping, but the number of tran sect points documenting the presence of cover crops within each county were generall representative of that county's cover cropping rate. (Table 2). Survey transect data wer processed in ESRI ArcGIS version 10.1. Specifically, 120 m buffers (polygons) were create around all 1262 GPS transect points, avoiding roads, ditches, and other non-agricultura land cover areas. We chose 120 m to keep the polygons as close to the roadside points a possible and to maintain proximity to the transect points, while permitting enough pixel to represent the ground. The average spectral signature for each band within a 40 m buffe around the 120 m buffer centroid was used to calculate band metrics in Table 2. Thes data were then used to train our random forest classification models.   We obtained HLS2 (Harmonized Landsat Sentinel 2 Data) BRDF-adjusted (Bidirectional Reflectance Distribution Function) L30 (Landsat-8 OLI harmonized surface reflectance resampled at 30 m over the Sentinel-2 tiling system) for all of Indiana [38] for the study year (2015-2016 growing season). We used the HLS product to facilitate future incorporation of the HLS S30 (Sentinel-2 30 m) product, which was not used in the present study. The study area required the download of 12 tiles: 16SDH, 16SEH, 16SFH, 16SDJ, 16SEJ, 16SFJ, 16TFK, 16TEK, 16TDK, 16TDL, 16TEL, and 16TFL. Data were reprojected using the 'gdalwarp' command in the 'rgdal' package, version 1.5-18 [39] in R [40]. For all scenes, we created a cloud mask based on the corresponding Quality Assessment (QA) layer that filtered out pixels with Cirrus (bit 0), Cloud (bit 1), or Adjacent Cloud (bit 2) according to the recommendations in the HLS Product user's guide Version 1.4 [41]. Data were compiled to represent the hydrologic year rather than the calendar year; the 2016 water year runs from 1 October 2015 through 30 September 2016.

Landsat Analysis-Ready Data
The Landsat ST product and associated Quality Assessment (QA) layers were obtained from the U.S. Landsat Analysis Ready Data product bundle. Only surface temperature estimates from the Landsat 8 thermal imager were used in this analysis for continuity with the OLI observations. The ST data are produced at 100 m resolution every 16 days. Broadly, the surface temperature algorithm uses Level-1 thermal infrared bands from the Thermal InfraRed Sensor (TIRS), along with other ancillary inputs, to estimate surface temperature in Kelvin for the terrestrial land surface. More algorithm details can be found in [33]. For all scenes, we filtered out pixels with high uncertainty in the ST based on the ST QA layer (>70%). Data were compiled to represent the hydrologic year and converted to degrees Celsius.

Band Math for Random Forest Inputs
We applied calculations to the L30 Bands and ST to obtain the metrics described in Table 3. We transformed Landsat bands to calculate both well-established and novel band metrics that we used as input data for the random forest classification models. We obtained NDVI using the standard Landsat 8 formulation where Bands 5 and 4 represent the NIR and Red bands, respectively. Further calculations performed on NDVI timeseries are described in  We used a machine learning classification approach in this study to assess cover crop presence on the landscape and distinguish cover crops from crop residues and bare soils. Machine learning is a powerful technique that allows an algorithm to 'learn' relationships between output classes (e.g., cover crops vs. bare soil) and input features (e.g., remotely sensed metrics) to classify novel input data. We used random forest, which is a supervised learning algorithm. For classification problems, random forest constructs an ensemble of decision trees based on input features during training, then aggregates votes from individual trees to determine the final predicted class. One strength of random forest (as opposed to other machine learning algorithms) is that the models are interpretable, in that the trained model can be used to gain insight into the relationship between features and classes using variable importance metrics [44].
Here, we developed both two-and three-class random forest models to predict winter land cover and land use practices in the U.S. state of Indiana. The two-class model was trained to predict the presence or absence of winter cover crops across the landscape. The three-class model aimed to distinguish between bare soil (conventional tillage), crop residues, and winter cover crops. To assess the value of adding observations and indices from additional regions of the electromagnetic spectrum, four iterations of each model were run with an increasingly complex set of input parameters. The four levels of complexity were as follows:

•
First level: NDVI-we used the maximum of fall (1 October-1 December) NDVI observations for each pixel.
• Second level: VisNIR-we used bands and indices from the visible and near infrared regions of the spectrum (see Table 2), in addition to the NDVI from the first level. • Third level: SWIR-we added two tillage indices, based on the shortwave infrared bands (see Equations (2) and (3) to the input datasets in the third level of complexity, in addition to all metrics in levels 1 and 2. • Fourth level: ST (Thermal)-we added the median of fall observations (October-1 December) and the absolute maximum ST from the Landsat provisional surface temperature product to the input datasets in this fourth and final level of complexity, in addition to all metrics in levels 1, 2, and 3.
Random forest models were created using the 'caret' package in R [45]. Because the classes were highly imbalanced (i.e., there were many more non-cover crop points than points with winter cover crops), we up-sampled the cover crop points in the training dataset to achieve balanced classes for model training. Specifically, the minority class (cover crops) was sampled with replacement to make the class distributions equal. In random forest model development, we conducted repeated cross-validation of 80% of the training data with 10 folds and 10 repeats. The number of trees (ntree) was set at 500 for all models. We also created a grid of hyper-parameter values that allowed the 'mtry' metric, which is the number of variables available for splitting at each tree node, to vary from 1 to 16 (the total number of variables in our input dataset). The models presented in Table 4 are models with the highest accuracy and kappa. Table 4. Training accuracy of cover crop detection algorithms when based on increasingly extensive range of wavelengths in the input datasets (total n = 989). We report the overall model accuracy, the value of the kappa statistic for the cover crop class, and the optimal mtry value used for the final model.

Model
First

Model Validation
In addition to reporting model statistics on the 20% of the validation data left out during model building, we also applied the models to all of Posey County Indiana and evaluated model performance against the transect data. The latter is a more difficult classification task for the machine learning algorithm because it must implement the algorithm to classify every pixel across the county. The 'best' models were selected based on both accuracy and Cohen's Kappa statistic: where P o is the relative observed agreement, and P e is the expected agreement. There is no standard scale for interpreting kappa values. The higher the kappa value, the better the match between machine learning classification results and the ground truth data.
Here where true positive (TP); true negative (TN); false positive (FP); and false negative (FN) are all possibilities for the relationship between the predicted condition (modeled class) and true condition (ground truth class). A 'false positive' is also known as a 'Type I statistical error' and 'false negative' is known as a 'Type II statistical error'. Because it is more important for our cover crop detection model to detect the presence of cover crops, rather than detect the presence of bare soil/conventional tillage, we also report the Positive Class Accuracy Positive Class Accuracy = TP (TP + FN)

Within-Model Training Accuracy Model Results
Overall, models that included thermal and surface temperature data were only slight improvements over the simpler greenness-based models ( Table 4). For the two class models, the fourth-level ST model, SWIR, and VisNIR model had equivalent kappa values, indicating the more complex model provided marginal benefit. For the three-class model, the addition of the more complicated VisNIR metrics and SWIR band increased model accuracy and kappa (from 72.1% accuracy to 78.9% accuracy and k = 0.58 to k = 0.68, with a slight added benefit of the additional thermal data in terms of model accuracy (79.7% accuracy) and kappa (k = 0.70). Overall, the overall training accuracy of the model was excellent. Next, we tested model strength more rigorously by applying the random forest model to spatially continuous input rasters. Such predictions on new data are more representative of real-world applications of our model than training accuracy.

Model Testing Results for Novel Predictions in Posey and Gibson Counties
Overall, models that included ST and thermal data were the best models in terms of both accuracy and kappa for both the two-and three-class models (Table 5). For the two-class model, the fourth level ST model was the best predictor of cover crop presence, with an overall accuracy of 89.4% and k = 0.72, indicating our model has 'substantial' predictive value according to Landis and Koch's rubric. For the three-class model, the fourth level ST model also performed best, with an overall accuracy of 79.8% and k = 0.69. The fall maximum NDVI-only models performed poorly, with only k = 0.2 and k = 0.12 for the two-and three-class models, respectively (Table 5). However, the biggest jump in improvement for both two and three class models was from the addition of more complicated NDVI-based metrics (Table 5) between the first and second level models. Table 5. Classification accuracy of cover crop detection algorithms when based on increasingly extensive range of wavelengths in the input datasets. Accuracy metrics were obtained by evaluating the accuracy of predictions on windshield surveys in Posey County (n = 397). We report the overall model accuracy, and the value of the kappa statistic for the cover crop class.

Model
First Level: NDVI Second Level: VisNIR Confusion matrices for the fourth-level Thermal ST models for the two-and three-class models are presented below. Positive class accuracy for the two-class model was 80.2%, indicating that the model generally predicted cover crops correctly (Table 6). Positive class accuracy in the three-class model was 72.8% for cover crops and 88.6% for residues, indicating the model was not as adept at accurately predicting cover crops as crop residues (Table 7). Variable importance scores for both the two-and three-class models showed that surface temperature was the most important predictive variable for both models (Table 8). For the two-class model that predicted cover crop presence/absence, the standardized tillage index was the second most important, with a variable importance score of 68.9, followed by the annual maximum value of band B10 at 57.4 (Table 8). For the three-class model that predicted crop residues and cover crop presence, the STI and NDTI were the second and third most important variables, with variable importance scores of 82.8 and 74.0, respectively. (Table 8). Notably, 'elapsed days' were not highly important in either model, with scores of 24.6 and 15.6 for the two-and three-class models, respectively. Variability in ST and variability in fall maximum NDVI were not strongly coupled, (Figure 4a), underscoring the limited predictive power of NDVI in our classification models. There was stronger coupling between ST and STI, particularly for the cover crop class (Figure 4b). This suggests that the relationship between ST and STI may be useful in identifying cover crops. Table 8. Variable importance scores for the two-class cover crop presence/absence and three-class cover crop/residue/bare soil model, ranked from most to least important. All metrics are the median fall observations (1 October-1 December) unless otherwise noted. More information on specific metrics can be found in Table 2. Displayed variable importance scores are scaled mean decrease accuracy (%IncMSE divided by its standard deviation).  Table 8. Variable importance scores for the two-class cover crop presence/absence and three-class cover crop/residue/bare soil model, ranked from most to least important. All metrics are the median fall observations (1 October-1 December) unless otherwise noted. More information on specific metrics can be found in Table 2. Displayed variable importance scores are scaled mean decrease accuracy (%IncMSE divided by its standard deviation). Despite high positive-class accuracy, the two-class model indicated a lower percentage of cover crops over Posey County than estimates from the Indiana Conservation Partnership ( Figure 5). Cover crop presence is 24.8% in the two-class model, and 30.0% in the three-class model (Figure 5a,b). In the three-class model, there was also 25.0% crop residue coverage. Corresponding input variables are also plotted alongside random forest model results (Figure 5c-  Variable importance scores for both the two-and three-class models showed that surface temperature was the most important predictive variable for both models (Table 8). For the two-class model that predicted cover crop presence/absence, the standardized tillage index was the second most important, with a variable importance score of 68.9, followed by the annual maximum value of band B10 at 57.4 (Table 8). For the three-class model that predicted crop residues and cover crop presence, the STI and NDTI were the second and third most important variables, with variable importance scores of 82.8 and 74.0, respectively. (Table 8). Notably, 'elapsed days' were not highly important in either model, with scores of 24.6 and 15.6 for the two-and three-class models, respectively. Variability in ST and variability in fall maximum NDVI were not strongly coupled, (Figure 4a), underscoring the limited predictive power of NDVI in our classification models. There was stronger coupling between ST and STI, particularly for the cover crop class (Figure 4b). This suggests that the relationship between ST and STI may be useful in identifying cover crops.  Despite high positive-class accuracy, the two-class model indicated a lower percentage of cover crops over Posey County than estimates from the Indiana Conservation Partnership ( Figure 5). Cover crop presence is 24.8% in the two-class model, and 30.0% in the three-class model (Figure 5a,b). In the three-class model, there was also 25.0% crop residue coverage.

Two-Class
Corresponding input variables are also plotted alongside random forest model results (Figure 5c-e).

Discussion
Our study evaluated the capacity of indices in the visible, NIR, SWIR, and thermal regions to predict winter cover crops and crop residues over the state of Indiana. Overall, we found that NDVI and metrics based solely on visible and NIR were not sufficient to detect cover crops or crop residues in Midwest agroecosystems. Our approach relied on comparing models with varying degrees of complexity by starting with NDVI, then adding a suite of indices from the visible and near-infrared, then incorporating SWIR bands, and finally incorporating thermal surface temperature data. For both our two-and threeclass model, we found that a more complex model that included ST yielded the most accurate predictions of both cover crops and crop residues. Our work supports findings elsewhere in the literature [21,23,24] that beyond greenness-that is, going beyond the visible range and infrared bands-is essential to improve the detection of winter cover crops.
Each region of the electromagnetic spectrum tested here-VisNIR, SWIR, and thermal-is related to different characteristics of vegetation and thus each provides unique information for detection. The visible and near-infrared wavelengths are proxies for leaf area index and photosynthetic activity [47,48]. Wavelengths in the SWIR target the water, cellulose, and lignin in non-photosynthetically active vegetation to distinguish it from bare soils [49,50]. ST, which until now has not been used to detect winter agricultural practices at regional scales, is a variable with multifaceted connections to vegetation activity that incorporates the impacts of both vegetation structure (i.e., surface roughness; [51]) and function (i.e., transpiration; [52]).

Discussion
Our study evaluated the capacity of indices in the visible, NIR, SWIR, and thermal regions to predict winter cover crops and crop residues over the state of Indiana. Overall, we found that NDVI and metrics based solely on visible and NIR were not sufficient to detect cover crops or crop residues in Midwest agroecosystems. Our approach relied on comparing models with varying degrees of complexity by starting with NDVI, then adding a suite of indices from the visible and near-infrared, then incorporating SWIR bands, and finally incorporating thermal surface temperature data. For both our two-and three-class model, we found that a more complex model that included ST yielded the most accurate predictions of both cover crops and crop residues. Our work supports findings elsewhere in the literature [21,23,24] that beyond greenness-that is, going beyond the visible range and infrared bands-is essential to improve the detection of winter cover crops.
Each region of the electromagnetic spectrum tested here-VisNIR, SWIR, and thermal-is related to different characteristics of vegetation and thus each provides unique information for detection. The visible and near-infrared wavelengths are proxies for leaf area index and photosynthetic activity [47,48]. Wavelengths in the SWIR target the water, cellulose, and lignin in non-photosynthetically active vegetation to distinguish it from bare soils [49,50]. ST, which until now has not been used to detect winter agricultural practices at regional scales, is a variable with multifaceted connections to vegetation activity that incorporates the impacts of both vegetation structure (i.e., surface roughness; [51]) and function (i.e., transpiration; [52]).
A crucial finding of our study is that going beyond greenness meaningfully improved model ability to predict cover crop and crop residue presence on the landscape. Overall, remote sensing of vegetation based on the traditional multispectral 'greenness' paradigm is not sufficient to detect cover crops at spatial and temporal scales relevant to land use and land cover change. The NDVI-only based models had poor predictive power (k ≤ 0.2; Table 5). Random forest models based on a suite of indices in the visible and near-infrared improved model performance, likely by leveraging variables relevant to both fall vegetation cover (i.e., Fall NDVI) and growing season crop performance (i.e., elapsed days, [24]). Indices in the SWIR were more important in the three-class vegetation model than the two-class model, likely because the three-class model had a specific category for crop residues which are most distinguishable by wavelengths in the SWIR (Table 8). Even so, the SWIR index STI was the second most important predictive variable in the two-class model (Table 8). Finally, the addition of ST led to the best models for both the two-and three-class models ( Table 5).
A second key finding from our study is that incorporating thermal data alongside SWIR resulted in the best detection model. Although there is little difference in ST between bare soil, crop residues, and cover crops in the dormant season, there are consistent differences between land cover types in the fall that can be leveraged by detection algorithms ( Figure 2). Thermal data improved detection accuracy over the SWIR model for the two-class model, increasing kappa from 0.67 to 0.72 (Table 5). For the three-class model, thermal data also led to increased predictive power, increasing kappa from 0.61 to 0.69 (Table 5). While this improvement may seem small, incorporating thermal data offers several important advantages that should be treated as crucial to future detection. As temperatures drop in the fall, bare soil is the coldest, while crop residues that blanket the land surface keep the land surface relatively warm (Figure 2). Cover crops are in the middle, with slightly warmer temperatures than bare soil [53], potentially due to increased surface roughness of vegetation or lower albedo if snow is present [54]. Thermal data was the most important predictor variable in the three-class model, which suggests that thermal data can amplify the differences between photosynthetically active and non-photosynthetically active vegetation. Overall, incorporating thermal data greatly improves our ability to detect cover crops in Midwest agricultural lands.
Our iterative random forest classification approach allowed us to evaluate the additional predictive value added by each additional wavelength region through a series of increasingly complex models. Furthermore, the variable importance scores of random forest inputs allowed us to evaluate the relative predictive power of explanatory variables (Table 8). In contrast to empirical approaches that test for correlations between biomass and vegetation indices, our machine learning approach allows prediction of cover crop presence on novel regions based solely on remotely sensed inputs. Assessment of variable importance scores indicates that surface temperature has strongest predictive power in both the two-and three-class model (Table 8). Indices in the SWIR were the second most important variables for both the two-and three-class models, underscoring the importance of SWIR for both cover crop detection and crop residue detection. Continuing to improve accuracy is essential given remote sensing's potential future role in monitoring and contributing to public policies, such as carbon markets and incentives for conservation agriculture adoption.
Ongoing improvements in spaceborne sensors means that additional information beyond greenness can be leveraged for agricultural applications including cover crop detection. Full-waveform information from hyperspectral imaging spectroscopy could distinguish photosynthetically active vegetation from crop residues and bare soils, and also allow investigation of individual species that comprise winter cover crops. Additionally, the increasing availability of high-resolution thermal data, like those provided by the ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) instrument aboard the International Space Station [55], allows for further exploration of the predictive power of thermal data for questions relating to agricultural land-use practices.
The improved ability to accurately detect cover crops, as well as crop residues, in farmlands provides a critical advantage in understanding the ecological effects of these practices on local and regional carbon, water, and energy cycles (e.g., [4,56]). Being able to link carbon sequestration, changes in crop biomass, and nutrient cycling means researchers will be better able to model the carbon and energy balance mitigation potential of sequestration benefits and improvements in nitrate retention resulting from cover crops.

Conclusions and Future Directions: Implications for Policy and Management
This study demonstrated the value of going beyond greenness for the detection of cover crop and crop residues, which have critical climate mitigation and broader ecological implications. Improved detection, even incremental above the promising advantages of incorporating SWIR bands, has important policy and management implications for agriculture. As policymakers look towards additional mechanisms to encourage greater adoption of cover crops to promote carbon sequestration, such as carbon markets [57], the availability of reliable detection methods will be critical to advancing these opportunities as a monitoring tool. The availability of improved detection will also enable better trend analysis that can benefit farmers by identifying how cover crops affect cash crop yields, soil moisture and nitrogen fertilizer retention, and soil organic matter. These opportunities are crucial because farmers in the US Midwest have been slow to adopt zero tillage practices, which leave crop residues on the land surface, and slower still to adopt cover crops [9,58].
The rapid development of improved remote sensing of cover crops and crop residues has shown that relying solely on NDVI to detect cover crops is problematic because once winter cover crops become dormant, it is very difficult to distinguish between cover crops, crop residues, and bare soils that have significantly different ecological benefits but are highly intermixed. Incorporating SWIR bands and thermal data are both important to advancing remote sensing detection of agroecosystems, particularly as sensors improve and provide for more temporally and spatially refined resolutions, which will be very important to account for the variability in management decisions around the timing of planting and termination of cover crops. Thermal data could increase in importance as warming temperatures and changing precipitation patterns facilitate unexpected ecological and managerial responses [59]. Random forest classification is also crucial for prediction on novel data and model interpretability. This work was limited to only five counties in one Midwest US State and looked at fall cover crops only because of the available transect data.
Looking forward, better remote sensing detection of cover crops has the potential to be of critical importance to environmental management and policy. Cover crops are implicated in mitigating water pollution, climate adaption from improved soil moisture retention, climate mitigation through carbon sequestration, and increasing agrobiodiversity by expanding current crop rotations [2,4]. Improving the temporal and spatial resolution of cover crop detection will contribute to more comprehensive analysis of benefits of cover crops from local to regional scales and inform both agricultural and environmental management.  The transect data presented in this study are available on request from the corresponding author. The transect data are not publicly available due to farmer privacy concerns.