Integration of Object-Oriented Remote Sensing and Machine Learning to Create Field Model for Optimized Regional Agricultural Management

Abstract: In an era marked by tools such as Artificial Intelligence (AI), Machine Learning (ML) and remote sensing (RS), agriculture is a primary beneficiary. These technologies help to optimize agricultural productivity by improving resource use and increasing yield. They also support adaptation to climate change, which requires managing the risks associated with agricultural practices. Vegetation Indices (VI) such as the Normalized Difference Vegetation Index (NDVI) are relatively simple yet useful algorithms that can be used to implement precision agriculture (PA). Optical satellite sensors record the light reflected from leaves, which provides crop development information that can be used to implement PA. This study monitors agricultural production both seasonally and daily using Sentinel-2 multi-spectral time-series data. Time-series images from 2017 to 2022 are analyzed to estimate the phenological dates of crops. To identify these stages, a combination of MSAVI (Modified Soil-Adjusted Vegetation Index) and NDVI is used. First, the mean MSAVI is calculated for each year. Then, depending on thresholds, MSAVI values are replaced with NDVI values for certain dates, and phenological dates are determined from the merged mean VI values. The results are compared with the Crop Progress Report (CPR) published by the United States Department of Agriculture (USDA) using the Root-Mean-Square Error (RMSE). After the stages are found, the field is mosaicked for each stage and each year. For the bare-soil dates, a Normalized Difference Salinity Index (NDSI) is calculated to understand the change in soil salinity. For the emergence and silking dates, MSAVI is used. For the dough, dent, mature and harvest stages, NDVI is used. To understand daily changes, object-oriented and pixel-based methods (land segmentation) for field models are used to detect trends in the field. The standard deviation of every pixel is calculated, and clusters are created with the k-means clustering algorithm. The field model captures the characteristics of the field. In PA, site-specific solutions are extremely important for obtaining optimum results. Since meteorological events have a great effect on agricultural applications, incorporating meteorological data is the main next step for improving this study. Overall, this research aims to contribute to regional agricultural production and management modules by using remote sensing and machine learning technology.


Introduction
To meet the nutritional needs of a growing population, agricultural practices must be implemented in smarter and more strategic ways. These advanced but careful methods are needed not only to meet demand but also to optimize costs while considering sustainability. One of the main sources of information for understanding a field is its time-series history [1]. Agricultural RS, which mainly uses surface reflectance in the visible, near-infrared and shortwave-infrared regions of the electromagnetic spectrum, supports site-specific solutions for implementing PA. GeoAI (Geospatial Artificial Intelligence) is a sub-field of Artificial Intelligence (AI) that focuses on the application of AI to geospatial data and problems. GeoAI methods can analyze remote sensing data to make better predictions for implementing PA.
NDSI is used at the very beginning of planting to evaluate soil conditions. It measures the salt content of the soil [2], which can affect the growth and yield of crops. MSAVI can be used to monitor early growth because it reduces soil background effects in areas where the soil is not completely covered by vegetation [3]. NDVI is used once the crops have grown to a stage where the fields are fully covered by vegetation [4]. NDVI is a powerful tool for estimating crop phenology [5].
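The three indices named above can be sketched in a few lines. The source does not give the formulas, so the forms below are assumptions: the standard NDVI ratio, the commonly used closed-form MSAVI (often called MSAVI2), and a red/near-infrared normalized difference for NDSI. The function names are illustrative, and inputs are assumed to be surface reflectance values between 0 and 1 (for Sentinel-2, typically band 8 for near-infrared and band 4 for red).

```python
import math


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def msavi(nir, red):
    """Closed-form MSAVI (MSAVI2 variant); assumed, not stated in the source."""
    term = 2 * nir + 1
    return (term - math.sqrt(term ** 2 - 8 * (nir - red))) / 2


def ndsi(nir, red):
    """Normalized Difference Salinity Index; red/NIR form assumed here."""
    return (red - nir) / (red + nir)


# Example reflectances for a partly vegetated pixel (made-up values).
nir, red = 0.5, 0.1
print(round(ndvi(nir, red), 3), round(msavi(nir, red), 3), round(ndsi(nir, red), 3))
```

With this NDSI form the index is simply the negative of NDVI, so it is only informative on bare-soil dates, which is consistent with how the study applies it. A pixel with zero reflectance in both bands would raise a division error and would need masking first.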
Phenological stages of crops are the distinct phases in the life cycle of a crop [6]. Understanding these stages is crucial for managing crop growth, as they determine the optimal timing for agricultural activities such as irrigation, application of fertilizers and pesticides, and harvest [7]. Timing is crucial so that in-season interventions can be made on time and future seasons can be planned for optimum production.
The motivation for this study emerged from the potential that technology holds for revolutionizing agriculture. Vegetation Indices (VI) such as NDVI, MSAVI and NDSI can be used to implement precision agriculture (PA), which optimizes resource use, increases yield and supports risk management. Additionally, the use of ML algorithms to predict phenological shifts in crops enables better agricultural planning for future seasons. Thus, the combination of remote sensing and a data-driven approach can have a transformative impact on regional agricultural practices and policies.
There are some novel methods for estimating phenological dates [8][9][10][11][12]; however, these works mostly focus on the regional scale and use satellite data with low temporal and spatial resolution. In this study, phenological dates are estimated at field level from satellite images with high spatial and temporal resolution, using MSAVI and NDVI. For bare-soil stages, NDSI is used to assess soil salinity. After the dates are determined, the rasters for each phenological stage are mosaicked to understand field dynamics as a time series. These mosaicked rasters are analyzed and clustered according to the temporal data.

Study Site
Illinois is the top farming state in the United States and lies in a region known as the Corn Belt. The main crops grown in the state are corn and soybeans (USDA FAS accessed on 10 August 2023). In this study, one field located in Illinois was selected. Corn was the only crop grown in this field between 2017 and 2022. This information was taken from crop-type maps published by the United States Department of Agriculture National Agricultural Statistics Service (NASS accessed on 10 August 2023). The field lies between 40.128°N and 40.142°N latitude and between 88.335°W and 88.35°W longitude and covers an area of about 240 ha. Figure 1 shows the study area and crop type map.

Remote Sensing Data
Sentinel satellites are part of the European Space Agency's (ESA) Copernicus Program. In this study, Sentinel-2 bands acquired between 2017 and 2022 were used.

Crop Progress Report
The United States Department of Agriculture publishes the Crop Progress Report (CPR), which gives information about the phenological dates of crops. For this study, CPR data were downloaded, interpolated and masked to the selected field area. A table was created from the mean pixel values for every week, and the crop phenological stages were assigned to the corresponding values. In the CPR, corn phenological stages are reported as planted, emerged, silking, dough, dent, mature and harvest (Table 1). Harvest, for example, is the stage at which the plant is collected from the field, either by cutting, threshing, or other means (USDA-NASS).

Phenology Estimation Model
In this study, a combination of threshold-based and slope-based methods was developed to detect phenological stages. The first step was to compute the MSAVI and NDVI time series. Satellite images can contain erroneous values because of atmospheric noise, clouds and shadows [13]. Pre-processing steps were therefore applied to the merged time-series vegetation index data to remove this noise and to smooth the series (Figure 2). A median filter [14] and a Savitzky-Golay (SG) filter [15] were used. First, the median filter was applied to smooth the time series by removing observations that deviate from the local trend; each value was replaced with the median of the values in its five-day temporal moving window. Then, the SG filter was applied to the median-filtered series. The SG filter has two parameters, window length and polynomial order. The window length specifies how many neighboring points are used for the polynomial fit, and the polynomial order sets the degree of the fitted polynomial. After that, the dates and values with a mean MSAVI greater than 0.6 were replaced with NDVI values, and a new data set was created. VI thresholds were determined to estimate the emerged, silking and dough stages. With the combined VI data set, the slope-based method was used to estimate the dent, mature and harvest stages. The planted stage was not estimated; instead, bare-soil dates were extracted from the MSAVI graph to measure soil salinity.
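The merging and smoothing steps can be illustrated with a minimal standard-library sketch. Several details are assumptions rather than the authors' settings: the series is taken to be regularly sampled so that a five-sample window stands in for the five-day window, the SG filter uses a window length of 5 and polynomial order 2 (the source does not report these parameters), the merge is applied before filtering as in the Figure 2 caption, and all function names are illustrative.

```python
from statistics import median

# Savitzky-Golay smoothing weights for a 5-point window and a quadratic fit.
SG_5_2 = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)


def merge_vi(msavi_series, ndvi_series, threshold=0.6):
    """Keep MSAVI, but switch to NDVI on dates where MSAVI exceeds the threshold."""
    return [n if m > threshold else m for m, n in zip(msavi_series, ndvi_series)]


def median_filter(series, window=5):
    """Replace each value with the median of its moving window (shrunk at the edges)."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(median(series[lo:hi]))
    return out


def savgol_5_2(series):
    """Apply the 5-point quadratic SG filter; the two edge points on each side are kept."""
    out = list(series)
    for i in range(2, len(series) - 2):
        out[i] = sum(w * v for w, v in zip(SG_5_2, series[i - 2:i + 3]))
    return out


# Made-up field-mean values with one cloud-like drop in the middle.
msavi_mean = [0.15, 0.22, 0.35, 0.05, 0.55, 0.65, 0.72, 0.70]
ndvi_mean = [0.20, 0.30, 0.45, 0.08, 0.70, 0.82, 0.88, 0.86]

merged = merge_vi(msavi_mean, ndvi_mean)
smoothed = savgol_5_2(median_filter(merged))
print([round(v, 3) for v in smoothed])
```

Running the median filter first removes isolated outliers such as the cloud-like drop, so the polynomial fit in the SG step is not pulled toward them.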

Field Characteristics Model
Based on the crop stage dates determined with the phenology estimation model, the raster data belonging to those dates were mosaicked. For bare-soil dates, NDSI images were merged. For the emergence and silking stages, MSAVI rasters were merged, and for the dough, dent, mature and harvest stages, NDVI rasters were merged. Mosaic maps were first created per year; then the cluster information from the individual years was mosaicked again. In the end, one raster layer was created for every stage, and the standard deviation was calculated for each pixel. The time-series standard deviation layer gives information about pixel variability over time [16]. This information is important for understanding the field at a site-specific level. After the standard deviation raster was extracted, clusters were created with the k-means algorithm. These clusters indicate how each part of the field needs to be managed.
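The per-pixel standard deviation and the clustering step can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: each raster is flattened to a list of pixel values, the population standard deviation is used, the k-means routine is a plain one-dimensional Lloyd's algorithm with evenly spaced starting centers, and the number of clusters is a free parameter because the source does not report it.

```python
from statistics import pstdev


def pixel_std(stack):
    """Per-pixel standard deviation over time for a list of flattened rasters."""
    return [pstdev(values) for values in zip(*stack)]


def kmeans_1d(values, k=3, iterations=50):
    """Cluster scalars with Lloyd's algorithm; label 0 is the lowest-valued cluster."""
    lo, hi = min(values), max(values)
    # Evenly spaced, sorted starting centers keep the cluster labels ordered.
    centers = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(v - centers[c])) for v in values]
        new_centers = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centers.append(sum(members) / len(members) if members else centers[c])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Three dates of a made-up three-pixel raster (one row per date).
stack = [
    [0.5, 0.5, 0.2],
    [0.5, 0.6, 0.8],
    [0.5, 0.4, 0.5],
]
std_layer = pixel_std(stack)
labels, centers = kmeans_1d(std_layer, k=2)
print(labels)
```

Because the starting centers are sorted, a higher label corresponds to higher temporal variability, which matches the cluster numbering described in the Figure 4 caption.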

Results
After running the threshold and slope-based phenological detection algorithm, the results were compared with the CPR dates. Table 2 lists the estimated and CPR-based phenological dates for 2021. The same table was created for the other years. For all study years, the estimated phenological dates were plotted against the CPR dates by week of the year (WOY) (Figure 3). The accuracy of the combined estimation method was evaluated using the Root-Mean-Square Error (RMSE). To calculate the RMSE, the start dates of the phenological stages were converted to Day of Year (DOY) format. The overall RMSE for estimating corn phenological stages was calculated as 7.44 days. A stage-specific analysis reveals that the highest error occurred in the emergence stage, with an RMSE of 5.58 days, while the lowest error was found in the silking stage, with an RMSE of 1.67 days. The harvest, dough, mature and dent stages had RMSE values of 2.58, 2.97, 3.51 and 3.54 days, respectively. When the emergence stage was excluded from the dataset, the overall RMSE improved to 4.61 days.

For the bare-soil dates, which fall between 4 February 2021 and 16 March 2021, NDSI rasters were mosaicked and clustered to understand the bare-soil trends. The same process was applied for the other years (Figure 4).
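The RMSE comparison described above can be sketched with the Python standard library. The function names and the stage dates in the example are hypothetical placeholders, not values from Table 2; only the conversion to Day of Year and the RMSE formula follow the text.

```python
from datetime import date
from math import sqrt


def day_of_year(d):
    """Convert a calendar date to its Day of Year (1 January is 1)."""
    return d.timetuple().tm_yday


def rmse_days(estimated, reference):
    """RMSE, in days, between estimated and reference stage start dates."""
    diffs = [day_of_year(e) - day_of_year(r) for e, r in zip(estimated, reference)]
    return sqrt(sum(x * x for x in diffs) / len(diffs))


# Hypothetical start dates for two stages in one season.
estimated = [date(2021, 7, 10), date(2021, 8, 5)]
reference = [date(2021, 7, 7), date(2021, 8, 9)]
print(round(rmse_days(estimated, reference), 2))
```

Differencing DOY values is only meaningful when both dates fall in the same year, which holds for a single growing season.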
Clusters were created from the pixel-level standard deviation values. These clusters are referred to as object-based field characteristics and help decision makers understand the field more precisely (Figure 4). If crop fields are well managed, one would anticipate high variability in conditions. On the other hand, low variability suggests that the land is not being actively managed, or that the area shows a persistent trend pointing to a characteristic problem, indicating that it should be managed differently from the rest of the field [16]. Therefore, the clusters with a high class number show the problematic areas for the different stages. Management decisions should be made based on these classes.

Discussion
The model currently investigates a corn field in Illinois, and its applicability to other crops and regions needs further study. There is also room for improvement in the early-stage "Emergence" predictions. Further ground observation work is needed to understand and correct the field characteristics in the field model. Future work can also focus on incorporating other factors, such as weather conditions and soil nutrient content, to improve the model's robustness.

Conclusions
This study exemplifies how the integration of GeoAI, remote sensing and machine learning can improve precision agriculture. It not only offers a robust model for understanding and predicting crop phenological stages but also opens the way to more focused and site-specific agricultural practices. In precision agriculture, the right source, right rate, right time and right place are important. In this study, the right timing is estimated through phenology analysis. The field model, which builds on the phenology estimation, aims to provide solutions for the right source, right place and right rate.

Figure 2. Original MSAVI, original NDVI and merged vegetation index data (if MSAVI is bigger than 0.6, it is replaced with NDVI values) (first graph); pre-processing steps applied to merged vegetation index (second graph).

Figure 3. Phenological stages found by threshold and slope-based algorithm and from CPR are marked for week of the year.

Figure 4. Standard deviations of pixel values in phenological stage rasters and results from the k-means clustering algorithm. A cluster with a high number is calculated from high-standard-deviation pixels; a cluster with a low number is calculated from low-standard-deviation pixels.

Funding: This research received no external funding.
Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.
Data Availability Statement: The CPR data were downloaded from the USDA-NASS site (https://www.nass.usda.gov/Research_and_Science/Crop_Progress_Gridded_Layers/index.php, accessed on 10 August 2023). Sentinel-2 satellite images were downloaded from Sentinel Hub (https://www.sentinel-hub.com, accessed on 10 August 2023).

Table 1. Description of phenological stages for corn.
Mature: The plant is considered frost-resistant, the corn is nearly ready for harvesting, the outer layers are open, and no green leaves remain.
Harvest: The plant is collected from the field, either by cutting, threshing, or other means.

Table 2. Estimated dates are compared with CPR dates for 2021.