Fusion Approach for Remotely-Sensed Mapping of Agriculture (FARMA): A Scalable Open Source Method for Land Cover Monitoring Using Data Fusion

: The increasing availability of very-high resolution (VHR; < 2 m) imagery has the potential to enable agricultural monitoring at increased resolution and cadence, particularly when used in combination with widely available moderate-resolution imagery. However, scaling limitations exist at the regional level due to big data volumes and processing constraints. Here, we demonstrate the Fusion Approach for Remotely-Sensed Mapping of Agriculture (FARMA), using a suite of open source software capable of efficiently characterizing time-series field-scale statistics across large geographical areas at VHR resolution. We provide distinct implementation examples in Vietnam and Senegal to demonstrate the approach using WorldView VHR optical, Sentinel-1 Synthetic Aperture Radar, and Sentinel-2 and Sentinel-3 optical imagery. This distributed software is open source and entirely scalable, enabling large area mapping even with modest computing power. FARMA provides the ability to extract and monitor sub-hectare fields with multisensor raster signals, which previously could only be achieved at scale with large computational resources. Implementing FARMA could enhance predictive yield models by delineating boundaries and tracking productivity of smallholder fields, enabling more precise food security observations in low and lower-middle income countries. of are with time-series Sentinel-1, Sentinel-2 or Sentinel-3 data. In each case, we demonstrate the use of FARMA for retrieving information on time-series changes in agricultural land cover. This approach is fully scalable and can be easily modified, placing it as a tangible method for large-scale systematic analysis for regional scale very-high resolution agriculture mapping and monitoring.


Introduction
An effective means of monitoring agriculture in sub-hectare (smallholder) fields is critical for maintaining global food security, to ensure that it can meet the challenges of a changing climate and population increases [1,2]. By 2050, global populations are expected to increase by 2.25 billion to a total of 9 billion, which will drive a 60% increase in demand [3] on agricultural resources, including food and derived products such as materials and biofuels [4]. Agriculture is particularly sensitive to changes in climate, but failures to keep local temperatures from increasing by more than 2 • C are expected to result in losses in aggregate production of wheat, rice, and maize in both tropical and Data fusion approaches, whereby datasets of different type (modality) and resolution are combined, can maximize the utility of VHR imagery. Using VHR in combination with either lower resolution or more temporally frequent imagery enables the benefits of different sources to be used in tandem, thus maximizing the benefit of each dataset and reducing its limitations [35,36]. Many data fusion approaches exist to maximize the benefits of different sources of imagery for agriculture mapping using a range of methods [16,35,[37][38][39], including the use of object-oriented approaches that utilize advanced machine learning and deep learning algorithms [40]. However, few have been developed that can efficiently merge VHR data with moderate resolution data on a regional scale or larger [41], or are not limited to a single specific application, such as land cover classification. This commonly results in object-based approaches whereby VHR data is used to derive detailed image objects that are representative of land cover parcels and coarser spatial resolution imagery is used for spectral analysis [42,43]. Such an approach was used by McCarty et al. [44] to extract smallholder crops to generate maps of phenology amplitude at 10 m resolution. Over 2700 VHR scenes were used to generate a wall-to-wall segmentation of the Tigray region in Ethiopia using a framework that utilized a large distributed computing network, of up to 30 virtual machines. While scalable, the repetition of such an approach requires access to similar architecture.
A substantial barrier to the more mainstream inclusion of VHR in agriculture mapping is therefore a suitable framework which easily facilitates the fusion of VHR data with other Remotely-Sensed data. Raster-based formats that enable the large processing and analysis of data efficiently, such as the Geographic Object-Based Image Analysis (GEOBIA) outlined by Clewley et al. [45], rely on raster data that require that all input imagery uses the same spatial resolution and grid. This becomes problematic when trying to combine data based on a VHR segmentation, as other datasets must be forced to match this resolution, to the detriment of data quality and data volume. A review of multi-source remote sensing data fusion approaches [35] outlines the existing major methods (e.g., pan-sharpening) but does not list specific frameworks through which data analysis can be conducted. Indeed, the advantages of object-based analysis for combining data are outlined, but the practical components of achieving this with large volumes of data across large geographic areas are not evaluated. Proprietary software, such as Trimble's eCognition (formerly Definiens eCognition; https://geospatial.trimble.com/products-and-solutions/ecognition) and more recently ESRI's ArcMap (https://desktop.arcgis.com/en/arcmap/), do provide mature and capable means of conducting object-based analysis with multi-resolution data, yet these software packages are economically expensive to acquire and cannot be easily scaled without using additional costly computational resources.
Therefore, the overarching goal of this effort was to create a method of fusing VHR imagery with both multispectral and Synthetic Aperture Radar (SAR) Remotely-Sensed data. Our objective was to create an open source, scalable, and efficient framework that could use VHR-derived objects for intricate and detailed mapping across large geographical scales, and efficiently fuse it with commonly available time-series Remotely-Sensed imagery for multi-year analysis. Our motivation was driven by the requirement for a method to analyze small-scale (1 ha) agricultural fields, but which could be suitable to use for any application, using image objects of any size and input remote sensing data of any spatial resolution. Data fusion through this framework enables objects identifiable with VHR to be observed temporally with moderate resolution sensors.

Software and Libraries
The Fusion Approach for Remotely-Sensed Mapping of Agriculture (FARMA) framework relies upon objects (groups of pixels/polygons) that store image statistics for each object within an attribute table, where each row is a unique object ID and each column is a given attribute. This relies upon a suite of open source software, used together to form an automated python workflow. To create, store, and analyze the objects, the following packages are used.
Docker/Singularity 2.1.1. Remote Sensing and GIS Library (RSGISLib) The Remote Sensing and GIS Library (RSGISLib; www.rsgislib.org) is a python library containing over 400 general purpose and specialized algorithms for processing raster and vector format remote sensing data [46]. These include zonal statistics, change detection, image segmentation, and object-based classification across 19 modules. This modular hierarchy enables commands to be used within a workflow that is easily integrated within a Python script. RSGISLib typically provides low-level functions, which can be combined by a user to create a complete algorithm/workflow, enabling flexible use of the software to repurpose commands for numerous different tasks. RSGISLib uses a number of common existing dependencies (e.g., GDAL (www.gdal.org)), which are used both within RSGISLib and as standalone libraries in the FARMA workflow. RSGISLib is therefore the core software on which FARMA is derived, through the use of existing commands linked via additional Python and specific newly created processes. RSGISLib was installed using the Aberystwyth University Earth Observation and Ecosystems Dynamics (AU EOED) research group Docker image: https://hub.docker.com/r/petebunting/au-eoed.

KEA
The KEA file format [47] (https://bitbucket.org/chchrsc/kealib/) is a HDF5-based image file format with a GDAL driver capable of storing header information, GCPs, and a raster attribute table (RAT) within a single file. We use it for its functionality to support raster attribute tables (RAT) and zlib-based compression, including on the RAT, which yields small file sizes. The KEA library (kealib) is released under the Massachusetts Institute of Technology/X Window System (MIT-X) license.

Pandas/GeoPandas
Pandas (https://pandas.pydata.org) is a python library that provides a dataframe environment well suited for reading, analyzing, and writing tabular data from common file formats including comma-separated values, JSON and SQL. GeoPandas (https://geopandas.org) extends the functionality of Pandas to allow spatial operations within common vector datasets such as GeoPackages. Within GeoPandas, file access and geometric operations are performed by Fiona and Shapely, respectively.

GeoPackage
A GeoPackage (GPKG; http://www.geopackage.org) is an open standards-based format implemented as a SQLite database container. It was defined by the Open Geospatial Consortium (OGC) in 2014 and can support both vector and raster data within multiple layers per file. GeoPackage is compatible with all commonly used GIS software and has advantages over existing formats (e.g., ESRI Shapefiles) including unlimited size and the ability to read data when compressed.

Docker/Singularity
Docker is designed to create, deploy, and run applications and software through the use of containers. These containers allow software and libraries to be packaged and distributed with all required dependencies and libraries. This allows the deployment of a single package that has cross-platform functionality. By using the system on which it is running, containers can be smaller in size but operate with increased performances over alternative virtual machines. The Docker container used for this method can be found here: https://hub.docker.com/r/petebunting/au-eoed and contains all the software packages required. Docker is not designed to be used on distributed computing systems as it requires root permissions. Singularity https://sylabs.io/docs/ provides the same functionality as Docker in such cases. The primary benefit of this is that the FARMA framework can be implemented independent of the user's computing platform.

Fusion Approach for Remotely-Sensed Mapping of Agriculture (FARMA)
2.2.1. Image Segmentation FARMA facilitates data fusion through the population of descriptive statistics from Remotely-Sensed imagery into image objects derived from an independent source. Specifically, FARMA is capable of fusing large-scale detailed VHR-derived objects with high-(10 m), moderate-(30 m), and low-resolution (>250 m) Remotely-Sensed imagery across large geographic domains efficiently. FARMA does not segment imagery; therefore, an external segmentation is required which can be derived from any source using any data, as long as it is provided in a rasterized format. Vector format segmentations can be rasterized prior to being used within the FARMA workflow. In two of the examples provided, we used our Agriculture Kernel Otsu Multi-thresh Segmentation (AKOMS) method [48], which applies a dilation and erosion to pixels at specified kernel sizes followed by multi-level histogram thresholding to segment agriculture fields into objects. This generates a binary image of pixel clusters which were clumped using RSGISLib, converting each individual cluster into a uniquely labeled object. A Shepherd [49] k-means segmentation with iterative elimination of objects below a user defined threshold, available within RSGISLib, was provided as an third example. Post clumping, very small (pixel scale) and very large (>10,000 ha) image objects were removed to eliminate the processing of erroneous features. These objects were a consequence of the segmentation process, which was independent of FARMA.

Image Tiling
The image objects were tiled to reduce demands on computing memory and enable the use of distributed computing environments, where available. The workflow is broken down into the following steps and provided diagrammatically in Figure 1. For reference all processing was carried out on an Apple MacBook Pro with 3.5 GHz Intel Core i7 processor with 16 GB of RAM. Step one includes the initial clumping and tiling of the raster segmentation which runs predominantly on a single core.
Step two onwards, which includes the processing of individual tiles and population of zonal statistics, can be run in parallel across a distributed computing environment.
Step 1: Initially a coarse regular grid, at the same dimensions and underlying resolution as the segmentation, was created using RSGISLib. The grid cells can be of any size as required by the user and can be tailored depending upon the computing resources available. Each grid cell had a unique value assigned to it. Each object in the segmentation is then populated with the mode value of the spatially associated (underlying) cell, thus providing a new segmentation composed of mode-object-tiles with soft boundaries between mode-object-tile edges. The minimum and maximum extent of each mode-object-tile is then populated into the RAT using an RSGISLib function and a blank image based on these boundaries is created, in preparation for the extraction of each mode-object-tile.
Step 2: (A) Each object in the mode-object-tiled image, within the bounds of each associated blank image, is exported, creating a mask of the objects in that tile. (B) Simultaneously, the initial segmentation is cut by the tile extent (defined by minimum and maximum extent populated in step 1), creating individual tiles with objects cut with hard boundaries. (C) The mask from step 2A is then used to extract the individual segments (tiled-segments) from step 2B, where each segment has its own unique ID. Each object is then relabeled so that each tile contains objects starting with an ID of 1, where 0 is "nodata". This yields multiple tiles composed of intact image objects.

Image Statistics
Step 3: Each tile of objects is then vectorized to a GeoPackage file (one GeoPackage per tile, containing one layer), creating a number of vector format tiles containing polygons of unique objects. Using functions within RSGISLib, image statistics from independent remote sensing data are populated into the polygons, using descriptive statistics (e.g., mean). Each image band is added to the GeoPackage layer as a unique column in the attribute table. The function in RSGISLib populates the pixel value if the pixel centroid is within the object (polygon) bounds. If the image pixel is larger than the polygon, the pixel value is sampled based upon the polygon centroid. To increase efficiency, each vector is read into memory before statistics are populated, reducing the I/O of the data. Any dataset at any resolution can be populated into the polygons, provided it is in raster format and of the same projection. Analysis can be conducted on the populated polygons, either through the use of Pandas/Geopandas or by populating the statistics back into the associated raster format RAT, which can produce smaller file sizes and enable the data to be read/written in blocks to increase computational efficiency.

Example Studies
Here, we provide examples of the FARMA method for two regions of interest, each with a different approach to observing changes at the field scale. Northern Senegal: (1) broad multi-year photosynthetic trends with Sentinel-3 imagery and (2) annual growing and harvesting cycles with Sentinel-2 imagery; the Mekong Delta, Vietnam: (3) annual rice harvesting cycles with Sentinel-1 imagery. Each is based on a VHR WorldView-derived segmentation, demonstrating that FARMA can use multiple remote sensing modalities and resolutions to efficiently characterize agricultural processes at the field-scale. Examples of two of these processes include (i) the number of rice harvests per year derived from SAR and (ii) field-scale phenology mapped with moderate resolution optical time-series imagery. The examples we provide are to demonstrate the flexibility of the FARMA workflow to use a range of Remotely-Sensed imagery to satisfy a range of applications, but are not focused on the specific processes at each location. Thus, an accuracy assessment is not provided as the results are examples synonymous with expectations of object time-series reflectance or backscatter.

Remote Sensing Imagery
In the examples we provide three different sources of Remotely-Sensed data that vary significantly in either sensor-type or resolution , which are processed through FARMA. This demonstrates that our approach is flexible and can ingest information across a range of remote sensing data types, imaging modes and temporal periods.

WorldView
Commercial VHR MAXAR WorldView (WV) imagery was collected by the National Geospatial Intelligence Agency (NGA) under the NextView license agreement. These data are available to government and non-profit organizations, under this license, that perform research that benefits U.S. government interests [50]. We collected WV-1 0.5 m panchromatic imagery and WV-2 and -3 panchromatic (0.3-0.5 m resolution) and multispectral imagery (1.3-2 m resolution) for segmentation and field object identification from NGA's data archives. We leveraged VHR resolution to define field boundaries and separate individual fields as objects when most fields were sub-hectare and not resolvable with Sentinel-1 or other high/moderate resolution sensors. VHR imagery were processed to top of atmosphere reflectance, orthorectified and resampled where necessary to 0.5 m resolution, using our enhanced very-high resolution (EVHR) products application programming interface (API) [51].

Sentinel-1 Time-Series
Sentinel-1 imagery is collected via two satellites as part of the European Space Agency (ESA) Copernicus constellation. Sentinel-1A and Sentinel-1B acquire SAR imagery every 6 days with a 12 day spacecraft repeat time. Sentinel-1 is a C-band (5.5 cm wavelength) SAR capable of capturing imagery irrespective of cloud cover and atmospheric conditions. The Sentinel-1 imagery catalog has been ingested into the Google Earth Engine Cloud Computing platform, where the imagery has undergone thermal noise removal and radiometric and terrain correction, available in log scale decibel (dB) units of backscatter. For agricultural regions where topography is generally minimized, these preprocessing steps are sufficient. For the following example, 2018 Sentinel-1 imagery was acquired over a region of interest (ROI) using both ascending and descending paths. Imagery within the ROI was converted from decibel to linear units and averaged into monthly time steps, to avoid calculating the geometric mean of the data. The mean image values were multiplied by a factor of 100,000 and downloaded as a 12 band image in integer format and resampled to the local UTM zone using a bilinear interpolation. Each band of the image represents one month of data.

Sentinel-3 Time-Series
Sentinel-3 is a constellation of two platforms operated by ESA, designed to measure sea surface topography, sea and land surface temperature, and ocean and land surface color. It carries numerous instruments, one of which is the Ocean and Land Color Instrument (OLCI) which is a push-broom spectrometer with 21 bands and an image resolution of 300 m, ranging from the visible to the near-infrared (400-1020 nm). Monthly (2017-2019) median Normalized Difference Vegetation Index (NDVI) imagery was downloaded from Google Earth Engine at a resolution of 300 m for the years 2017, 2018 and 2019, providing 3 years of data across 36 image bands. The image was reprojected to UTM, using a nearest-neighbor resampling method. The use of NDVI is sensitive to changes in photosynthetic cover and negates the requirement for atmospheric correction, which is currently not available as a standard Sentinel-3 product.

Sentinel-2 Time-Series
Sentinel-2 is an ESA constellation of two satellites. Each carries a multispectral imager (MSI) that records 13 bands from visible through shortwave-infrared portions of the electromagnetic spectrum (442-2202 nm). Image bands are available in 10 m, 20 m and 60 m dependent upon the wavelength across a 290 km field of view. The repeat orbit period across both satellites is 5 days at the equator providing frequent coverage in regions heavily impacted by cloud cover. The imagery is open access and is available on the Google Earth Engine platform. We created greenest-pixel composites for each month of 2018. The greenest-pixel composite uses the pixel, in a time-series stack of imagery, from the image which has the highest NDVI pixel value. Each month was exported at 10 m resolution as a band in a 12-band image. The image was reprojected to the local UTM zone using a nearest-neighboring resampling method.

Senegal River Valley
Over the past 30 years, West Africa's (WA) population has more than doubled, growing by 2.7% annually. By 2030, 490 million people are expected to live in the WA Sudanian and Sahelian region [52]. The majority of rural residents in the region, which account for 45% of the population [53], are directly involved in agriculture, which is the leading cause of land cover change in West Africa and specifically within Senegal [54]. An increasing population in this environment must face the challenges of meeting both food security and producing crops and goods for export, in order to meet the increased burden of a growing population upon the economy. Rural areas, such as the peanut basin, are the source of primary smallholder staple crops such as maize and millet, with larger industrial crops that produce exportable goods more prevalent in northern Senegal. Specifically, the Senegal River Valley is an important source of valuable cash crops in the region [55] and sugarcane has emerged as one of the principal yields during the past 40 years [56], primarily driven by large agrobusiness for in-country processing and export. However, this relies heavily upon irrigation to provide water in an otherwise arid environment where precipitation has decreased over the past half-century [57]. As such, accurate maps of commercial crop extent and duration are critical for managing resources in the region and predicting yields which underpin the national economy. Here, we provide two examples of fusing a VHR resolution WV-3 segmentation with (1) low resolution Sentinel-3 (300 m) imagery and (2) high-resolution Sentinel-2 (10 m) imagery, to understand agricultural cycles in a region of northern Senegal ( Figure 2).

Senegal Study Site A: Broad Scale Photosynthetic Cover Mapping
Broad patterns within agricultural areas can be used to analyze regional trends that occur across multiple growing seasons. Coarse spatial resolution imagery is useful for this as it typically covers broad geographic areas with small data volumes. A disadvantage of this approach however is not being able to retrieve information for a specific field, or cluster of fields if required. A combination of VHR derived field objects and low-resolution imagery enables this, by providing knowledge on regional trends but within a local-scale setting. For input into FARMA, pan-sharpened 0.5 m resolution WorlView-3 imagery, acquired December 2016 was segmented using the AKOMS method, described above. A total of 20,697,179 image objects were processed, once objects below 0.125 ha were removed, covering a total area of 33,913 ha. The FARMA workflow was then used to grid the segmentation into tiles, each 10,000 × 10,000 pixels in size, creating 35 tiles of data from an image of 31,049 × 119,224 pixels. The Sentinel-3 median NDVI 36-band imagery was populated into each object using the mean pixel value, creating 36 attribute columns representing monthly NDVI, for each of the 35 GeoPackage files. Using Geopandas, the attributes were read from the geopackage and a greenness threshold was set on an NDVI value of 0.1, assigning a value of 1 to objects that met that criteria. For each year, this information was summed and divided by the total number of months and finally multiplied by 100. This provided a new column which represented the quantity of time that an object was green over the time period of 1 year, expressed as a %.
The percentage annual photosynthetic cover was successfully mapped for each object across 3 years of Sentinel-3 imagery (Figure 3). Each object represented the quantity of time that had a photosynthetic cover greater than 0.1 and differences were observed between each year. For the region in Figure 3 specifically, this was observed to be an increase in duration of photosynthetic cover for 2018 over 2017, and 2019 over 2018. Some fields in this region were observed to have had a photosynthetic cover for as much as 50% of the year, represented by warm colors. The reason for this is not known but is interpreted to be connected to factors that either prolong a growing season or promote multiple harvests. Such factors could include favorable growing conditions or the early availability of irrigation. Areas of non-photosynthetic cover had a low duration %, represented by blue colors in Figure 3.
This was achieved using the FARMA workflow in an estimated time of 13.5 h using four computing cores. The initial clumping of the segmentation and creation of tiled objects was completed within 2 h. The conversion to vector format of the 20.5 million objects took 11 h and was the most time-intensive step. The population of the Sentinel-3 pixels into the objects took 40 min to complete. imagery. This demonstrates that the method can determine broad agricultural trends at the field-scale using coarse spatial resolution imagery. An increase in time that the fields contained photosynthetic cover was observed from 2017 to 2019 although the physical cause behind this was not determined.

Senegal Study Site B: High-Resolution Agriculture Monitoring
While the method has demonstrated an ability to map photosynthetic cover in the Senegal River Valley using coarse resolution (300 m) spectrometer imagery, more detailed field-scale information could be retrieved when using higher-resolution data. VHR-derived objects combined with high-resolution time-series imagery would allow agriculture to be mapped on a field-by-field basis, unveiling not only its distribution but the occurrence of the peak and fallow seasons and the number of harvests per year. To assess this, a cloud-free WV-3 multispectral image was acquired for May 2017 over an agricultural region of northern Senegal (Figure 2) at 2 m spatial resolution. Prior to use within FARMA, the image was segmented using the Shepherd [49] algorithm available in RSGISLib. The segmentation was run with a minimum object size of 4000 pixels, based on a qualitative assessment of the objects to best represent the agriculture fields in the imagery. This 4000-pixel threshold is equivalent to 1.6 ha in the imagery, providing a total of 13,372 objects covering an area of 46,686 ha. Sentinel-2 monthly Greenest-pixel NDVI composites, as described above, were populated into the image objects using the mean, following the steps within the FARMA workflow. The FARMA end-to-end processing time took 2 h, using a single computing core.
Each image object was populated with monthly mean NDVI for the year 2018. This information was used to retrieve the month in which the peak NDVI value occurred, on a per-object basis ( Figure 4A). Within the agricultural field objects this occurred almost uniformly in the months of May and June, while non-agricultural objects peaked in months later in the year (September-December). The peak NDVI value for each object was also mapped ( Figure 4B), and the agricultural fields were observed to contain values much larger than the surrounding objects. This suggests that the peak production occurs at a time when vegetation does not naturally flourish and underpins the importance of irrigation in the region. Such information on a year-by-year basis can be used to record temporal changes in peak production, which has a subsequent impact upon yield prediction. Furthermore, some fields were observed to have a low maximum NDVI value, despite being adjacent to high-NDVI fields. The cause of this is not known but may be used to infer fields that are not under cultivation, have become unproductive or have been impacted by potential irrigation shortages during a given agricultural cycle.
As each object contained NDVI information over the course of a year, the temporal signature of different land cover types could also be investigated. As shown in Figure 5A, the agriculture fields show a distinct increase in NDVI during the peak growing season. This is accompanied by a second smaller peak during the fall months. This pattern was the same for a large number of the agricultural fields. Fields which were not used ( Figure 5B) showed a consistently low NDVI, with only a small increase interpreted to be due to natural vegetation cover (e.g., grass cover) with seasonal changes in precipitation. Objects which were consistently vegetated ( Figure 5C) over the year demonstrated a high NDVI value for the majority of the months, but fluctuations were also evident. The change in trend cannot be attributed to a single driver, but the vegetation sampled was located on a river meander. Decreases in the NDVI signal could therefore be due to a decrease in productivity in dry months or flooding during the wet season. In each case the trend in NDVI was successfully retrieved at monthly intervals over the period of one year. where NDVI values were substantially larger than the surrounding land cover. Surrounding low NDVI values that peak in the winter months infer that the agriculture is artificially managed to produce yields at certain times of the year, likely via irrigation.

Mekong Delta, Vietnam
Vietnam is the world's second largest exporter of rice [58], with increased output driven by economic reform and market liberalization during the 1980s, which saw the country move towards modernized farming practices, including the use of new rice varieties, irrigation, pesticides, and fertilizers [59]. This modernization has enabled the Mekong Delta to become one of the world's most intensively cultivated regions for rice production and one of the few locations worldwide that practice triple cropping rice agriculture [60]. Thus, the Mekong Delta provides a large geographical region where rapid changes occur over short temporal periods to test the ability of FARMA to achieve detailed large-scale monitoring. Cloud cover in the region inhibits the use of optical data to monitor these trends, therefore we provide an approach using VHR and radar imagery to fuse a fine-scale segmentation of the region with dense time-series data to monitor rice harvesting. Sentinel-1 data are well suited for this task as it is uninhibited by cloud cover and it is sensitive to differences in plant structure associated with the growing and harvesting of rice. Flooded ponds without rice act as a specular reflector and will result in little backscatter being received by the sensor while the C-band wavelength is sensitive to herbaceous vegetation, such as rice, with backscatter increasing as the rice grows and the biomass increases [17,20,61,62]. Our focus study region is on the provinces of An Giang and Đõng Tháp but extends into other provinces of Vietnam and neighboring Cambodia, dictated by the availability of WorldView imagery ( Figure 6). The fields and field boundaries in this region are often below the size of moderate-and coarse-resolution data, thus accurate segments representing sub-hectare fields are challenging to derive without the use of VHR imagery. Prior to use within FARMA, 88 WorldView images (1 m) covering the period December 2009-February 2019 were mosaicked and segmented using the AKOMS method described above and clumped in RSGISLib to create 2,123,808 objects. This was reduced down to 78,953 by removing objects that were below 1000 m 2 (0.1 ha) and further down to 78,946 with the removal of objects above 100,000,000 m 2 (10,000 ha). Remaining objects covered an area of 394,608 ha across a total image size of 116,500 × 111,000 pixels. Prior to clumping, a binary image derived from thresholds set on Sentinel-1 backscatter imagery was used to mask out pixels that had values outside of the range expected for agriculture (e.g., very bright targets such as urban areas). The segmentation was ingested into FARMA and gridded into tiles of 10,000 × 10,000 pixels and processed according to Section 2.2.2, resulting in 129 GPKG polygonized objects. Time-series Sentinel-1 imagery, as described above, was populated into the objects, yielding 12 attribute columns each representing monthly averaged backscatter values.
The approach was able to successfully characterize time-series backscatter for the rice fields over the period of a year, revealing the growing and harvesting cycles that occurred. This cycle was compared directly with the Sentinel-1 imagery (Figure 7) so that the trend could be corroborated with the Remotely-Sensed signal. For the location shown, three distinct peaks were observed in the time-series record, which is in concurrence with the triple harvesting practices in the region. Peak Sentinel-1 backscatter was detected in February, June and October, followed by large immediate decreases in backscatter. This is synonymous with the harvesting of rice as the highest backscatter was observed as the rice achieved peak biomass, before a minimum backscatter was observed as the rice was removed, leaving a specular reflecting water surface at the next time step. This information was retrieved for all of the image objects across the entire 394,608 ha area, demonstrating that FARMA can provide dense time-series information on a per-field basis across very large geographical areas. The processing of this example, which represents the largest geographical area of all of the study sites provided, took approximately 15 h on an Apple MacBook Pro with 3.5 GHz Intel Core i7 processor and 16 GB of RAM, using 4 cores for multiprocessing. The most time-intensive step was the creation of the individual tiles, from extracting them from the single segmentation to the creation of single vector format tiles (GPKGs). This stage took 8 h using 4 cores but the processing time of each tile varied considerably. Approximately 75% of tiles took ≤5 min each to run while 96% were completed within 30 min each. A second time intensive step was the segmentation into a KEA format RAT and initial tiling of the segmentation in raster format. Due to insufficient RAM, a tiled single-thread version of the command was used but this would require less time if sufficient RAM or if the multi-thread tiled version of the command was used. The clumping took 4 h and the creation of the tiled raster image took an additional 1 h on the 1 m imagery. The population of the zonal statistics took 2 h using 4 cores. The timed process did not include the initial image segmentation nor the acquisition of the data used in the analysis (e.g., Sentinel-1 backscatter), as these are independent of the FARMA workflow. An overview of the processing steps with associated run time is given in Figure 8.

Discussion
The FARMA method has been demonstrated to be capable of fusing VHR imagery with other raster-based sources of information to achieve scalable results for agriculture mapping and monitoring. The method, which uses zonal statistics across tiled vector format segmentations, was efficient at processing VHR resolution objects across large geographical domains using modest computing resources. This is a pioneering method for multi-resolution data fusion for land cover mapping and monitoring. As demonstrations of this, we derived maps of photosynthetic cover expressed as a function of time, peak production (NDVI) period and value, signatures from agriculture and non-agriculture for categorizing land cover types and time-series trends in agricultural cycles. These results were obtained using three separate examples in Senegal and Vietnam that represented different environmental settings and agricultural types. Multi-spectral (2 m) and pan-sharpened (0.5 m, 1 m) VHR imagery was used to derive segmentations on which the object image analysis was conducted and auxiliary data from Sentinel-1, -2 and -3 satellites were used to derive information on the agricultural productivity, on a multi-year basis where required.
This framework provides a novel means for comprehensive agricultural monitoring and is situated as a practical solution to persistent limitations (e.g., computing resources, data volume) in field-scale mapping across large geographical areas. Accurate sub-hectare field mapping and monitoring, provided by FARMA, is critical for assessing food security, both locally in rural subsistence farming and globally across industrial food producing regions [7,48,63]. This is pertinent in West Africa and the Mekong Delta respectively, where different practices at varying scales underpin vital food security and national economies, yet where the risk of malnourishment also persists [6]. Climate change and population increase pose additional future challenges on the agricultural sector to meet increasing demands [5], but potential benefits can be made under these scenarios if appropriate and well informed land management decisions are made [8]. However, these must be relevant at the field-scale level, which has not been possible thus far due to the ongoing under-representation of knowledge at this scale across regional domains [16]. A method of accurately characterizing sub-hectare fields, as presented with FARMA, is the foundation on which effective decisions can be made.
The goal of this work was not specifically to understand agricultural practices as the study sites provided, but to demonstrate that FARMA is an efficient and flexible workflow capable of retrieving such information. FARMA has been able to demonstrate that the method is insensitive to different modalities and resolutions and is thus a tangible method for all agricultural and land cover analysis.

Comparison to Other Methods
The FARMA method is comparable to a number of existing methods [44,[64][65][66] although it has demonstrated important increases in performance, particularly in terms of processing time and is developed by some of the same authors as these currently available methods. However, an important distinction is that the method presented in this study is object-based and the majority of existing examples rely on pixel-based techniques. FARMA is therefore more readily comparable with existing GEOBIA approaches from Clewley et al. [45], Trimble eCognition, and ESRI's ArcMap. FARMA offers considerable advantages over proprietary software by being open source and fully scalable without the requirement for software licenses. The GEOBIA method of Clewley et al. [45] is built upon the same underlying architecture as FARMA (e.g., RSGISLib, GDAL etc) and offers many of the same benefits in terms of scalability, yet it is not suited for multi-resolution analysis due its reliance upon raster grids.
We therefore consider the primary advantages of FARMA to be fourfold: 1.
FARMA provides an efficient and scalable method for the fusion of multi-resolution data. Highand moderate-resolution time-series imagery was populated into the VHR derived objects, but the segmentation and imagery used for analysis can be any combination of sizes. Furthermore, multiple resolution pixel values can be populated into the same image object, allowing a user to analyze multiple raster datasets per object at their native resolution. In addition, the system is fully open source and does not require proprietary software at any stage of the process.

2.
It has the ability to tile the imagery so that smaller portions of data can be processed more efficiently even with modest computing resources. However, within a large distributed computing environment, this enables the advantage of splitting the data volume across a large number of CPUs. The run time subsequently becomes trivial as all tiles can be run simultaneously. A primary advantage of FARMA is that the tile size is entirely customizable, so a user can trade tile size for number of tiles, depending on the computing architecture available. FARMA can be fully operated on a single thread or it can be distributed across as many CPUs as possible.

3.
It is able to tile polygons without creating artificial boundaries between them. While the benefit of processing large data volumes as individual tiled datasets is known, this has not been readily demonstrated with image objects which do not conform to neat tile boundaries. FARMA enables objects to be assigned to a tile using a mode value, subsequently enabling a segmentation to be tiled using soft tile boundaries. Furthermore, by creating these tiles as vector datasets, this has the benefit of enabling zonal statistics to be populated into objects irrespective of spatial resolution while being processed in parallel.

4.
FARMA reads each vector layer into memory before calculating and assigning zonal statistics. While this may pose an issue for very large vectors (>10,000 ha), typically not associated with agriculture fields, this has the benefit of reducing the vector I/O and thus is more computationally efficient. This contributes to the framework being fully scalable.

Performance
FARMA was demonstrated across three examples, each different in the size of the geographical area mapped, the number of objects contained in each segmentation and the resolution from which the segmentation was derived. Senegal B had the lowest run time, completing within 2 h on a single computer core. It covered the second largest area (46,686 ha), but with the lowest number of objects (13,327), derived from the lowest resolution imagery (2 m) and with the largest minimum object size (1.6 ha). The Senegal A and Vietnam examples had similar run times of 13.5 and 15 h, respectively, but were based on two very different segmentations, while Vietnam also used a modified version of FARMA (tiled clumping) due to the large segmentation size. The Vietnam example (394,608 ha) covers a larger area than Senegal A (33,913 ha) but Senegal A contained a far larger number of objects (20.5 million) from higher resolution imagery (0.5 m) in comparison to Vietnam (79,953 objects; 1 m imagery). We infer that the number of objects and object complexity drove the processing time for Senegal A while the large area, greater number of individual tiles and increased object-size drove the processing time for Vietnam. The run time of the examples is therefore dictated by a combination of the size, number, and complexity (number of nodes) of the objects and the number of tiles used. Consequently, comparing the processing times to provide recommendations for using FARMA is challenging and all of these parameters should be carefully considered, in addition to the computing resources available. However, polygonizing the objects was a relative computationally intensive step in all examples, demonstrated by 11 of the 13.5 h run time for Senegal A assigned to polygonizing the 20.5 million objects. If possible, we therefore recommend the removal of objects that are not of interest, such as those in the Vietnam example that were greater than 10,000 ha. Nevertheless, the run time of FARMA was efficient considering the tasks that were achieved. Furthermore, the most intensive steps are required to be run just once, requiring an initial investment of time but which yields low investments of time thereafter. In all instances the population of the independent imagery took a small amount of time (maximum 2 h; Vietnam), thus FARMA could be used as an efficient monitoring system whereby new information is regularly ingested without the requirement to reprocess the objects. As FARMA ingests externally created segmentations it is important to note that segmenting VHR imagery across large geographical domains is a nontrivial and compute intensive task. While this has no bearing on the performance of FARMA, the additional time and resources required to achieve a segmentation should be considered.

Framework Expansion
FARMA is based on a series of algorithms in the RSGISLib [46] Python module, arranged in such a way to retrieve efficient zonal statistics in tiled vectors, accessed, and analyzed using additional Python packages (e.g., GeoPandas). FARMA, therefore, has a modular nature that allows other packages to replace or be incorporated alongside existing ones. This allows new algorithms and workflows to be built on top of the current framework and does not constrain FARMA to one specific task or method. The FARMA workflow is completed with the population of remotely-sensed data into the image objects. As this data is accessible using pandas/numpy, it can be readily used with a wide range of Python modules and existing workflows, thus FARMA provides the user with the freedom to conduct any analysis they wish. Such analysis includes time-series trend monitoring as provided above, but can also include crop type classifications using machine learning algorithms within the SciKit-Learn [67] package, analysis of changes in crop structure (e.g., canopy height), crop biomass and yield estimation, and crop nitrogen mapping and monitoring, if the input remote sensing data are available to achieve this. In this regard, the examples here are only a very small selection of potential possibilities that could be readily achieved. Furthermore, FARMA is not limited to agricultural applications and as such, a segmentation of any land cover type could be processed using the same steps, thus the application is independent of the method and can be broadened to a range of uses.

Data Processing Considerations and Limitations
FARMA was designed for efficient processing, but is still limited by the computing resources available. FARMA can be used on a single-thread system but will not be at its most efficient due to the increased I/O when using a large number of tiles. The benefit of FARMA is therefore maximized when processing very large datasets across a large number of nodes. All analyses presented here were run on an Apple MacBook Pro (3.5 GHz Intel Core i7 processor and 16 GB of RAM) thus processing limitations are not expected to be detrimental if large computing clusters are not available. The initial stage of assigning objects to tiles must currently all be run on a single thread. In cases where the input imagery is extremely large, this could cause errors associated with degraded performance and insufficient RAM. For our largest example, the Mekong Delta, this ran in approximately 1 h on an image with 116,500 × 111,000 pixels at 1 m resolution. We envisage limitations in performance may occur with an image at this resolution at an image size of 1 • × 1 • or greater. To avoid such a bottleneck, the segmentation could be first divided into smaller images to run the initial single-thread step. Once tiled, all individual tiles could be combined into one workflow and run across multiple CPUs. Another second potential source of degraded performance was found to be the presence of very large continuous objects (>4500 ha; 60 min+ processing time each). These can cause a single tile to be very large as it is assigned a mode tile value but can spread across multiple tile space. However, the large objects encountered in this work were a function of the segmentation, which is independent of the FARMA method and were resolved by removing very large (e.g., rivers, >10,000 ha) and very small objects (e.g., single pixel noise), which were not meaningful at the field-scale.
We are confident that FARMA is a complete and capable method of conducting dense time-series analysis of agriculture over large geographical domains. However, there are several potential limitations that exist. Firstly, FARMA is solely object-based and while this provides a number of advantages, there is currently no means of achieving pixel-based analysis if it is required. Furthermore, FARMA is reliant upon an external high-quality input segmentation, but one that cannot adequately resolve field boundaries or target objects will yield unsatisfactory results. This includes field boundaries that may change over time due to changes in cultivation practices and tillage. Similarly, users must acknowledge the limitations of what can be interpreted when combining a segmentation and time-series imagery of vastly different resolutions. In Senegal Study A we merged a VHR-derived segmentation with 300 m pixel Sentinel-3 imagery. While this demonstrates the ability of FARMA to combine different remote sensing data to generate landscape scale analysis, it must be acknowledged that several fields fall within one pixel and thus caution must be taken when interpreting these results at fine scales. Lastly, FARMA has no graphical user interface (GUI) so is reliant upon the user having knowledge of the Python scripting language. As the software is available through Docker, FARMA is appropriate for those with beginner-to-intermediate knowledge of Python.

Conclusions
We provide a framework for efficiently fusing VHR image objects with other Remotely-Sensed data for the purpose of efficient agriculture mapping and monitoring. The Fusion Approach for Remotely-Sensed Mapping of Agriculture (FARMA) is an open source Python-based framework that can tile large very-high resolution-derived segmentations into smaller images for use within a distributed computing environment. Soft boundaries prevent the splitting up of objects, which are subsequently vectorized and populated with statistics from independent imagery using an efficient zonal statistics approach. To demonstrate FARMA, we provide three examples from Senegal and Vietnam, where image objects derived from VHR imagery of agriculture are populated with time-series Sentinel-1, Sentinel-2 or Sentinel-3 data. In each case, we demonstrate the use of FARMA for retrieving information on time-series changes in agricultural land cover. This approach is fully scalable and can be easily modified, placing it as a tangible method for large-scale systematic analysis for regional scale very-high resolution agriculture mapping and monitoring.