1. Introduction
Farmland size has a significant impact on food security, environmental sustainability, and socio-economic development. Developing countries such as those in South Asia and Africa, where smallholder farmers dominate, face a series of challenges in agricultural development. An aging population reduces labor productivity [1,2]; limited arable land necessitates reliance on other agricultural inputs (such as pesticides and fertilizers) to increase land productivity. However, excessive application of fertilizers, herbicides, and pesticides can lead to soil compaction, disruption of soil microbial communities, and other detrimental effects, ultimately resulting in soil degradation [3]. Smallholder farmers’ limited capital accumulation and education often hinder their access to credit and other financial support, slowing the adoption of new production technologies and impeding the modernization of agriculture [4].
Although information on farmland size is crucial, accurate estimation is not easy. In recent years, the increased availability of very high-resolution (VHR) satellite imagery has made it possible to estimate farmland size at a given location through visual interpretation. In 2015, Fritz et al. initiated the Geo-Wiki field size campaign, mobilizing volunteers to interpret remote sensing images and label farmland sizes in sample areas [5]. By sampling 13,000 points globally and using inverse distance weighting spatial interpolation, they provided the first estimation of global farmland size at a 1 km resolution. However, due to insufficient sample points and the limitations of the interpolation method, the results were not accurate enough. Lesiv et al. addressed these shortcomings by launching another campaign, sampling ten times as many points (130,000) and designing more meticulous and rigorous labeling specifications to improve estimation accuracy [6]. Their research contributed significantly to understanding global farmland size, but could not serve as a routine monitoring tool due to the unavoidable subjective interference and high labor costs of manual interpretation.
Automatic farmland size mapping algorithms based on image processing and computer vision principles can overcome the shortcomings of manual interpretation and have achieved satisfactory results in areas dominated by large farms, such as the United States [7] and Europe [8]. Research on automatic farmland size mapping using remote sensing images can be classified into three categories based on principles: edge-based [7,9], region-based [10,11], and hybrid methods [8]. Edge-based algorithms identify farmland by recognizing farmland edges: edge detection operators are convolved with the image to extract edges where gray values change abruptly. Region-based algorithms group spatially similar pixels into the same region, i.e., the same farmland, through region growing from random seed points. Hybrid algorithms combine the characteristics of the first two. The remote sensing images used in these studies can be divided into two categories based on spatial resolution: VHR images with meter-level or finer resolution, and 10 m-level data such as the Sentinel and Landsat series. VHR images contain rich semantic information, and studies using them often train deep learning models on a large number of labeled samples to extract plot boundaries and identify plots. Studies using 10 m-level data rely more on expert experience to establish extraction rules.
Distinct from crowdsourcing and automatic farmland size mapping methods, survey-based methods are the third approach, but the temporal and spatial resolution of the statistical data is very coarse [12]. Due to high costs, systematic censuses are often conducted only once every several years, and data released at the smallest administrative unit cannot finely depict spatial differences. Additionally, differences in statistical definitions, such as whether cultivated land includes permanent grassland or whether the size refers to farm size or farmland size, have led to significant discrepancies in different scholars’ estimates of global smallholder farming [13,14]. According to a study published in the journal World Development, 84% of the world’s 608 million farms operate on less than 2 hectares of farmland [12]. They use 12% of the world’s agricultural land and produce about 35% of the world’s food. In China, 98% of farmers operate small farms of less than 2 hectares, accounting for almost half of the world’s small farms [15]. Smallholder farming practices do not match modern agricultural development. For example, fragmented farmland hinders the use of large agricultural machinery, and scattered smallholders find it difficult to accumulate the capital needed for modern agriculture. To address this, the Chinese government encourages land transfer and consolidation, merging small plots into larger ones and concentrating farmland in the hands of a few farmers for large-scale agricultural operations [16]. This process of farmland concentration, strongly promoted by the Chinese government, is proceeding rapidly. However, there is a lack of suitable methods to track this process in a timely, accurate, and cost-effective manner.
Automatic farmland size mapping algorithms using remote sensing images struggle to achieve satisfactory results in small-scale farming areas with fragmented farmland. Census-based data on farmers, due to their coarse temporal and spatial resolution, cannot finely depict spatiotemporal dynamics. Crowdsourcing, which relies on online volunteers to interpret remote sensing images and label field sizes, can provide relatively accurate estimates, but organizing such campaigns requires considerable human and material resources. To address this research gap in farmland size measurement for small-scale farming areas, this study develops a farmland size measurement algorithm based on medium- and high-resolution satellite imagery. Meter-level VHR images can clearly depict farmland boundaries, but data and computational costs limit their large-scale application. Open-access data policies make 10 m-level medium- and high-resolution satellite imagery a better choice for scientific research and practical applications. However, in small-scale farming areas, plots are smaller and more fragmented, and the boundaries between plots are thinner, often less than 10 m wide, which is below the spatial resolution of the imagery, so they are not obvious in medium- and high-resolution remote sensing images. Farmland size measurement in small-scale farming areas is therefore very difficult. Our innovation lies in strengthening the boundaries by overlaying multiple edge images and calculating the frequency of edge occurrences to classify the boundaries into different types. Boundaries in farmland can be divided into two categories: permanent edges (PE) and temporary edges. The former, such as wider paved roads, can be identified in images throughout the year, while the latter, such as traces of strip-planted crops or of agricultural machinery cultivation, can only be identified in images from certain seasons.
Generally, the larger the farmland, the smaller the proportion of PE per unit of cropland area; conversely, the smaller the farmland, the larger the proportion. Therefore, we can use the proportion of PE per unit area of farmland (Edge Cropland Ratio, ECR) to reflect farmland size.
We selected China’s main grain-producing provinces as the study area, used Sentinel-2 remote sensing images from 2019, and calculated the ECR of sample points from the Geo-Wiki field size campaign. Comparison with manually labeled results verifies the feasibility and scientific validity of this method. The remote sensing images and computing resources provided by the Google Earth Engine cloud computing platform provide an opportunity for the large-scale application of this method. Therefore, this study attempts to answer the following questions: Can the ECR proposed in this study reflect farmland size? What is the appropriate radius size for sample points?
2. Materials
2.1. Data
The remote sensing data used in this study are Sentinel-2 surface reflectance data from the European Space Agency (ESA, Paris, France). The 10 m resolution red and near-infrared bands are used to calculate the Normalized Difference Vegetation Index (NDVI). Vegetation indices effectively highlight the differences between vegetated and non-vegetated areas. Among them, NDVI is a long-standing and widely used index that is calculated from the 10 m bands alone, thus preserving the high spatial resolution. Many other vegetation indices require 20 m or 60 m bands, resulting in a reduced spatial resolution. The data are acquired by the Sentinel-2 A/B satellites, which achieve a revisit cycle of 3–5 days, and even 1–2 days in high-latitude regions. Data from 2019 were selected to maintain consistency with the year of Lesiv’s data.
We also utilized the 10 m global land cover product WorldCover provided by ESA to determine cropland areas [17]. The nominal year of this product is 2020, and it is produced using Sentinel-1/2 data, representing one of the most advanced land cover products currently available.
The farmland size annotation dataset comes from the research of Lesiv [6]. Lesiv’s Geo-Wiki Field Size campaign collected a total of 130,000 sample points globally. Through crowdsourcing, volunteers were mobilized to label the size of the largest farmland appearing around the sample points, referencing high-resolution remote sensing satellite images from providers such as Bing and Google, as well as medium and high-resolution satellite images from Landsat and Sentinel.
The data was preprocessed using the Google Earth Engine Python API [18], downloading NDVI remote sensing image tiles to the local computer. Python libraries such as scikit-learn, rasterio, and numpy were used for processing and calculating the ECR.
2.2. Study Region
We selected six major grain-producing provinces in China, located in the Northeast China Plain and North China Plain, as our study area (Figure 1). These provinces include Inner Mongolia, Liaoning, Shandong, Anhui, Hubei, and Henan. The study area encompasses important grain production regions in China, with diverse crop types and cropping systems. Statistical yearbooks (https://data.cnki.net/yearData (accessed on 15 July 2024)) show that in 2020, the study area contained 40.8 million hectares of cultivated land, producing 84.668 million tons of summer grain and 85.669 million tons of wheat, accounting for 32%, 59%, and 64% of China’s total production, respectively.
The Northeast China Plain has a temperate continental monsoon climate with cold, long winters and warm, short summers, with precipitation concentrated in the summer. The main crops grown are spring wheat, corn, soybeans, sorghum, and rice. The cropping system is single cropping, with sowing mainly in the spring and harvesting in autumn. Due to the short growing season, most crops are early-maturing varieties. The Northeast China Plain has fertile soil, with black soil being widely distributed, suitable for large-scale mechanized farming. However, it is prone to natural disasters such as drought and sandstorms in spring.
The North China Plain has a temperate monsoon climate with four distinct seasons and precipitation concentrated in the summer. The main crops grown are winter wheat, corn, soybeans, peanuts, cotton, vegetables, and fruits. The cropping system is double or triple cropping. Winter wheat is sown in autumn and harvested in the summer; corn, soybeans, and other crops are sown in spring or summer and harvested in autumn. Some areas can also grow a season of vegetables or other cash crops. The North China Plain has flat terrain, fertile soil, good irrigation conditions, and a long history of agricultural production. However, water resources are unevenly distributed, and some areas experience drought and flooding.
We overlaid the farmland size dataset onto the study area, using color and point size to display different levels of farmland. According to Lesiv’s estimates, large-scale farmland (L, XL) is mainly located in the Northeast China Plain, while the North China Plain is dominated by very small farmland (XS).
3. Methods
The algorithm proposed in this study utilizes one year of Sentinel-2 imagery to calculate the ECR index, which reflects farmland size. The entire process, as illustrated in Figure 2, can be divided into three parts: (1) In the preprocessing stage, image tiles of the sample area are acquired, and vegetation indices are calculated. (2) In the edge extraction stage, edge detection algorithms are applied to identify edges from the vegetation indices. The frequency of edge occurrences is calculated, and the Otsu automatic thresholding algorithm [19] is iteratively applied twice to identify permanent edges. (3) In the analysis stage, we compare the estimated ECR values at different radii and select the most suitable radius.
3.1. Preprocessing
We selected samples from the Geo-Wiki Global Field Size estimation dataset that fell within our study area and obtained Sentinel-2 remote sensing image chips covering a certain range around these samples. The QA band of the images indicates pixel quality, such as whether there is cloud contamination. Based on the QA band, we calculated the percentage of clean pixels in the tile area and retained tiles with a clean pixel percentage greater than 99%, discarding contaminated image tiles. We used the 10 m red and near-infrared bands to calculate the NDVI, maintaining the original spatial resolution (10 m). The NDVI highlights the difference between vegetation and non-vegetation areas more than the original spectral bands. Its value ranges from −1 to 1, with denser vegetation having higher values; vegetated areas typically have values above 0.2.
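The preprocessing rules above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function names are hypothetical, and the boolean `cloud_mask` is assumed to have already been decoded from the QA band (True where a pixel is contaminated).

```python
import numpy as np

def ndvi(red, nir):
    """NDVI = (NIR - red) / (NIR + red) per pixel; NaN where NIR + red is zero."""
    red = np.asarray(red, dtype=np.float64)
    nir = np.asarray(nir, dtype=np.float64)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

def clean_fraction(cloud_mask):
    """Share of pixels that are not flagged as contaminated."""
    cloud_mask = np.asarray(cloud_mask, dtype=bool)
    return 1.0 - float(cloud_mask.mean())

def keep_tile(cloud_mask, min_clean=0.99):
    """Retain a tile only if its clean-pixel share exceeds min_clean (99% in the text)."""
    return clean_fraction(cloud_mask) > min_clean
```

Because both inputs are 10 m bands, the NDVI array has the same shape as the inputs, so no resampling step is needed.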
3.2. Edge Extraction
Inside the farmland are dense crops, while the edges are roads or other dividers with sparser vegetation. On the NDVI image, there is a gradient difference from the farmland to its edges. After converting the floating-point NDVI image to integer data by multiplying it by 10,000, the Canny edge detection operator was used for convolution to extract areas with abrupt grayscale changes, i.e., farmland edges. Compared to other edge detection operators like Sobel and Laplacian, Canny performed better in this regard [20]. The relevant parameters were determined through trial and error, set at 100 for the threshold and 1.5 for sigma. At this setting, more edges can be identified.
Binary edges were extracted from each NDVI image and then summed to get the number of times edges were marked in the NDVI image over a year. Its typical probability density distribution is shown in Figure 2, which can be divided into three categories from left to right: internal farmland areas, temporary edge areas, and permanent edge areas. Temporary edge areas could be identified in some NDVI images throughout the year, such as strip planting marks within the farmland, which disappear after the crops grow vigorously. Permanent edges (PE), on the other hand, can be seen in almost all images throughout the year. We used the land cover product WorldCover to extract cropland in the sample area, and then used the automatic threshold segmentation algorithm Otsu [19] iteratively twice to automatically classify the three types of farmland areas. The first threshold (otsu_1) divides the farmland area into edges and non-edges, and the second threshold (otsu_2) further divides the edge area into PE and Non-PE. The PE ratio of the farmland area, i.e., the ECR, was calculated, corresponding to the area to the right of otsu_2 in the probability density distribution graph. Ideally, the larger the farmland size, the smaller the proportion of PE in the segmented farmland, i.e., a lower ECR.
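The edge-frequency and two-pass Otsu steps can be illustrated with the following sketch. It assumes the per-image binary edge maps are already available as a stack of shape (images, rows, columns) and that `cropland` is a boolean mask derived from WorldCover; the function names and the hand-written Otsu routine are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def otsu_threshold(values):
    """Otsu cut t for non-negative integers: maximizes the between-class
    variance of the groups {v <= t} and {v > t}."""
    values = np.asarray(values).ravel().astype(np.int64)
    hist = np.bincount(values).astype(np.float64)
    levels = np.arange(hist.size)
    w0 = np.cumsum(hist)               # number of pixels with value <= t
    w1 = w0[-1] - w0                   # number of pixels with value > t
    s0 = np.cumsum(hist * levels)      # sum of values <= t
    s1 = s0[-1] - s0                   # sum of values > t
    valid = (w0 > 0) & (w1 > 0)
    if not valid.any():
        raise ValueError("need at least two distinct values")
    mu0 = np.divide(s0, w0, out=np.zeros_like(s0), where=valid)
    mu1 = np.divide(s1, w1, out=np.zeros_like(s1), where=valid)
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, -1.0)
    return int(np.argmax(between))

def edge_cropland_ratio(edge_stack, cropland):
    """Return (ECR, otsu_1, otsu_2) for one sample area."""
    edge_stack = np.asarray(edge_stack, dtype=bool)
    cropland = np.asarray(cropland, dtype=bool)
    freq = edge_stack.sum(axis=0)              # times each pixel was marked as edge
    crop_freq = freq[cropland]                 # restrict to cropland pixels
    otsu_1 = otsu_threshold(crop_freq)         # edges vs. non-edges
    otsu_2 = otsu_threshold(crop_freq[crop_freq > otsu_1])  # PE vs. temporary edges
    permanent = crop_freq > otsu_2
    return permanent.sum() / crop_freq.size, otsu_1, otsu_2
```

In this sketch the second pass raises a `ValueError` when the edge pixels share a single frequency value, which is one way a sample with too little cropland can fail to yield a usable threshold.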
3.3. Radius Analysis
The ECR indicator is affected not only by the actual farmland size but also by the estimation radius of the sample points. Therefore, we conducted a robustness test on the radius used to compute the ECR to determine the optimal radius. To preserve spatial heterogeneity, a smaller radius is preferable. However, the smaller the radius, the less farmland area is included in the sample; when it is small enough, the statistical distribution of the edge counts is no longer typical, and the histogram-based Otsu threshold segmentation algorithm cannot find a suitable threshold, resulting in more outliers in the ECR estimation. The radius used for the interpretation of the Geo-Wiki dataset is 200 m, so we started from 200 m and tested radii of 200 m, 400 m, 800 m, 1600 m, and 3200 m.
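To make the trade-off concrete, the sketch below counts how many 10 m cropland pixels fall inside each tested radius around a sample point. The function names are hypothetical, the sample point is assumed to sit at the center of the tile, and a pixel is counted when its center lies within the radius.

```python
import numpy as np

RADII_M = (200, 400, 800, 1600, 3200)  # radii tested in the text

def circular_mask(radius_m, pixel_size_m=10.0):
    """Boolean mask of pixels whose centers lie within radius_m of the central pixel."""
    half = int(radius_m // pixel_size_m)
    offsets = np.arange(-half, half + 1) * pixel_size_m
    yy, xx = np.meshgrid(offsets, offsets, indexing="ij")
    return xx ** 2 + yy ** 2 <= radius_m ** 2

def cropland_pixels_by_radius(cropland, radii_m=RADII_M, pixel_size_m=10.0):
    """Number of cropland pixels inside each radius around the tile center."""
    cropland = np.asarray(cropland, dtype=bool)
    rows, cols = cropland.shape
    cy, cx = rows // 2, cols // 2
    counts = {}
    for radius in radii_m:
        mask = circular_mask(radius, pixel_size_m)
        half = mask.shape[0] // 2
        if min(cy, cx) < half or cy + half >= rows or cx + half >= cols:
            raise ValueError(f"tile too small for a {radius} m radius")
        window = cropland[cy - half:cy + half + 1, cx - half:cx + half + 1]
        counts[radius] = int((window & mask).sum())
    return counts
```

Each doubling of the radius roughly quadruples the number of pixels that feed the edge-count histogram, which is why larger radii give the Otsu step a more stable distribution at the expense of spatial detail.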
3.4. Validation and Assessment
The Geo-Wiki Field Size dataset is currently the most accurate publicly available global farmland size labeling product [6]. We compared our ECR estimation results with it to validate our algorithm. There were a total of 1792 sample points in the study area for this dataset. Based on the farmland size within a 200 m radius grid around the sample point, three volunteers interpreted high-resolution remote sensing images to determine the dominant farmland size, divided into five levels: XL for >100 ha, L for 16–100 ha, M for 2.56–16 ha, S for 0.64–2.56 ha, and XS for <0.64 ha. In addition, there is “No field”, indicating no farmland within the range, and “NA” for no label due to inconsistent labeling by the three volunteers or a lack of available satellite imagery for interpretation. It is worth noting that labeling based on the dominant farmland size, rather than the average farmland size, overestimates farmland size in areas with mixed large and small farmlands.
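The five size levels can be written as a small lookup, sketched below. The class limits come from the text; how values that fall exactly on a limit are assigned is an assumption made here, since the stated ranges share their endpoints. The limits correspond to square fields with sides of 80 m (0.64 ha), 160 m (2.56 ha), 400 m (16 ha), and 1000 m (100 ha).

```python
def size_class(area_ha):
    """Map a field area in hectares to a Geo-Wiki size level (XS, S, M, L, XL)."""
    if area_ha <= 0:
        raise ValueError("area must be positive")
    if area_ha < 0.64:
        return "XS"
    if area_ha < 2.56:   # assumption: a field of exactly 0.64 ha counts as S
        return "S"
    if area_ha < 16:     # assumption: a field of exactly 2.56 ha counts as M
        return "M"
    if area_ha <= 100:   # assumption: a field of exactly 16 ha counts as L
        return "L"
    return "XL"

def square_area_ha(side_m):
    """Area in hectares of a square with the given side length in meters."""
    return side_m * side_m / 10_000
```

For example, `size_class(square_area_ha(400))` returns "L" under these assumptions, because a 400 m × 400 m square covers exactly 16 ha.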
The reference data is an ordinal variable, while our calculated ECR is a continuous variable. We conducted three forms of comparison: First, we used box plots to show the ECR distribution of different parcel size levels to visually demonstrate whether this variable can reflect parcel size differences. Then, we discretized the continuous ECR variable into an ordinal variable consistent with Geo-Wiki labels, with the thresholds determined based on correlation coefficients and expert experience. We used a confusion matrix to show the consistency between the parcel size predicted by the ECR and the manually labeled parcel size, and calculated accuracy metrics such as the user accuracy, producer accuracy, and overall accuracy. Finally, we performed a Spearman test on the two variables to test their correlation. The Spearman rank correlation coefficient (Equation (1)) can assess the association between two ordinal variables [21]. Its value ranges from −1 to 1. A value greater than zero indicates a positive correlation between the two variables, a value less than zero indicates a negative correlation, and a value of zero indicates no monotonic correlation. The absolute value of the coefficient indicates the strength of the correlation, with values closer to 1 indicating a stronger correlation.
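The accuracy metrics and the Spearman coefficient described above can be sketched in plain Python as follows. The function names are hypothetical. The coefficient is computed as the Pearson correlation of average ranks, which handles the many ties that five-level labels produce; without ties it equals the familiar closed form 1 − 6Σd²/(n(n² − 1)), where d is the rank difference of each pair. Which form the paper's Equation (1) takes is not visible in this section.

```python
from collections import Counter

def accuracy_metrics(reference, predicted, labels):
    """Overall, producer's (per reference class), and user's (per predicted class) accuracy."""
    pairs = Counter(zip(reference, predicted))
    overall = sum(pairs[(c, c)] for c in labels) / len(reference)
    producer, user = {}, {}
    for c in labels:
        ref_n = sum(pairs[(c, p)] for p in labels)    # samples labeled c in the reference
        pred_n = sum(pairs[(r, c)] for r in labels)   # samples predicted as c
        producer[c] = pairs[(c, c)] / ref_n if ref_n else float("nan")
        user[c] = pairs[(c, c)] / pred_n if pred_n else float("nan")
    return overall, producer, user

def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            ranks[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (vx * vy) ** 0.5
```

To correlate labels with the ECR, the ordinal levels would first be encoded as integers in size order (for example XS = 1 through XL = 5, an encoding chosen here for illustration).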
5. Conclusions and Discussion
Estimating farmland size in smallholder farming areas using publicly available medium- and high-resolution satellite remote sensing data is a current challenge in farmland size estimation. This study developed a farmland size estimation method suitable for smallholder farming areas using Sentinel-2 imagery. We used the Northeast China Plain and the North China Plain as study areas and conducted robustness tests on the radius size of the samples. A radius of 1600 m is recommended for this study area. At this radius, the ECR in North China was concentrated at 0.085, while the Northeast Plain was concentrated at 0.105. This is consistent with the farmland size distribution revealed by the Geo-Wiki plot size annotation dataset. After converting the continuous variable ECR into an ordinal variable using the thresholds 0.065, 0.074, 0.085, and 0.099, the Spearman correlation coefficient with the reference value reached 0.315, and the p-value was much less than 0.001. This proves the feasibility and scientific validity of the method. The method uses publicly available remote sensing data with low computational costs and self-adaptive algorithm parameters, which has good spatial generalization ability. In the future, large-scale farmland size estimation can be carried out using this algorithm.
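As an illustration of the discretization step, the four thresholds reported above split the ECR range into five ordered bins. The sketch below returns only the bin index, because this section does not state which bin corresponds to which Geo-Wiki level; the function name and the treatment of values exactly equal to a threshold are assumptions.

```python
from bisect import bisect_right

ECR_THRESHOLDS = (0.065, 0.074, 0.085, 0.099)  # thresholds reported in the text

def ecr_bin(ecr, thresholds=ECR_THRESHOLDS):
    """Index (0 to len(thresholds)) of the interval containing ecr.
    Assumption: a value equal to a threshold falls into the higher bin."""
    return bisect_right(thresholds, ecr)
```

With these thresholds, `ecr_bin(0.05)` gives 0 and `ecr_bin(0.105)` gives 4.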
5.1. Comparison with Crowdsourcing-Based Visual Interpretation
The farmland size predicted based on the ECR was consistent with the visually interpreted results in small farmland areas, but there were significant differences in large farmland areas. This discrepancy is not due to algorithm flaws, but rather to differences in interpretation rules: visual interpretation labels are based on the size of the largest farmland within the sample area. Moreover, if the largest farmland exceeds the sample area, interpreters can determine the size of the farmland based on the surrounding context. Therefore, using only a 200 m radius, theoretically, the maximum farmland area is 16 ha (400 m × 400 m), but farmland exceeding 100 hectares can be labeled. Our ECR algorithm reflects the average size of farmland within the sample area, not the maximum value. It strictly calculates based on the sample area and cannot utilize background information outside the sample area.
Crowdsourcing (or citizen science) involves public participation in non-profit academic research, allowing researchers to collect information at a lower cost. However, the efforts of participants still represent a significant human cost. Additionally, the quality of the collected information depends on the participants’ competence, which may conflict with the reproducibility requirements of scientific research. Therefore, after obtaining reference results for global farmland size through citizen science, it is still necessary to strive for the development of automated methods for farmland size estimation.
5.2. Comparison with Automatic Field Boundary Delineation
Automated farmland boundary identification algorithms require the accurate identification of farmland boundaries to estimate farmland size. However, the prerequisite for the accurate identification of farmland boundaries is that they are clearly visible in remote sensing images. This requires remote sensing data with sufficiently high spatial resolution. After initially identifying farmland boundaries, the post-processing stage often involves complex morphological operations to correct topological errors in the boundaries. The design of morphological rules assumes that the shape of farmland is regular, such as circular or rectangular. Therefore, automated farmland boundary identification algorithms face many challenges in small farmland areas.
5.3. Prospect and Limitations
Large-scale, regular monitoring needs to consider data costs and algorithm generalization ability [22,23]. In terms of data, earth observation technology is now advanced enough to observe the earth’s surface at a sub-meter level, and visual interpretation of such imagery through crowdsourcing can assess global farmland size, but the high labor cost makes it impossible to use as a routine monitoring method [5,6]. Automated farmland boundary identification algorithms are not yet mature enough for application in small farmland areas, and cannot balance cost and efficiency [10,24]. In terms of algorithms, many options exist, including the object-oriented classification algorithms provided by Trimble eCognition software version 9 and deep learning models represented by the Segment Anything Model (SAM) [25], but these algorithms have limitations. For example, the multi-resolution segmentation algorithm commonly used in eCognition has many parameters that need to be adjusted [26], and deep learning models require very high computational costs and a large amount of training data [24].
We used publicly available 10 m Sentinel-2 satellite data and designed an algorithm based on adaptive parameters that can reflect the farmland size in small farm areas. Some factors affecting the ECR, such as topography, lighting conditions, climate, and farming systems, will be further considered in future research. This algorithm can be migrated to the Planet satellite constellation in the future [27]. Although public Sentinel-2 data cannot match commercial satellites in terms of spatial resolution, their advantages, such as frequent revisits, more accurate radiometric calibration, and lower data cost (free), make them the best choice for large-scale, long-term monitoring.