1. Introduction
Since the reform and opening-up, China’s agricultural productivity has increased significantly [1,2,3]. Cereal yield rose from 4659 kg/ha in 1995 to 6272 kg/ha in 2019. Per capita grain output has exceeded the internationally recognized security line of 400 kg for 13 consecutive years. Concurrently, the agricultural labor force has continuously migrated to cities, driving the urbanization rate from 17.9% in 1978 to 60.6% in 2019. This spatial population restructuring and labor transfer are deeply shaped by national macro-policies, including the hukou system reform, regional development strategies, and new urbanization plans [4]. As rural labor has massively shifted to non-agricultural sectors, farmers’ land-use behaviors have changed. The growing prevalence of off-farm employment has significantly increased farmers’ willingness to transfer (lease out) their land [5]. To address the risk of land abandonment caused by labor outflow and to optimize resource allocation, the Chinese government has introduced macro-policies to promote rural land transfer. A notable example is the comprehensive implementation of the “Three Rights Separation” system, which eliminates transfer barriers at the property rights level [6]. Driven by institutional reforms and labor transfer, China’s land transfer rate rose rapidly from 12.0% in 2009 to 37.0% in 2017 (National Bureau of Statistics of China). However, following this rapid expansion, the pace of land transfer has recently stagnated. Recent studies confirm that the transfer rate has fallen back and stabilized at around 36%, indicating a significant slowdown in the expansion of farmland scale [7,8].
The plateau in land transfer has sparked ongoing academic debates regarding the driving forces and actual effects of farmland scale expansion. One central controversy is whether farmland scale expansion inevitably improves production efficiency [9,10]. No consensus has been reached on this issue, for two main reasons. First, factor endowments and natural conditions vary significantly across countries, leading to different cultivated land characteristics in research samples. Second, the indicators used to measure production efficiency vary widely. These indicators include yield [10], labor productivity [11], technical efficiency [12], and total factor productivity [13]. Existing literature provides a rich foundation for understanding agricultural production characteristics. However, most studies focus solely on the direct impact of farmland scale on production efficiency. Few studies incorporate the marginal cost of land transfer into their analytical frameworks to explore its dynamic constraints on the optimal farmland scale.
In China’s agricultural practice, land fragmentation [14] and underdeveloped land rental markets [15,16] are the primary constraints driving up land transfer costs. On the one hand, due to the egalitarian land distribution system in the early stages of reform, farmers’ cultivated land exhibits persistent fragmentation, which hinders spatial consolidation. On the other hand, the current land rental market remains highly informal. Transactions frequently lack written contracts (67.4%) or definite lease terms (65.2%), relying heavily on acquaintance networks, which results in a lack of formal institutional guarantees for these transfers (China Rural Revitalization Survey, 2020). This physical fragmentation and institutional market friction jointly inflate the transaction costs of land transfer. When farmers attempt to expand their operations, the increasing number of rented plots escalates the difficulties in information searching, negotiation, and contract enforcement, which manifests as increasing marginal transaction costs for land transfer. According to microeconomic production theory, farmers will stop expanding when the marginal cost equals marginal revenue, restricting the spontaneous expansion of farmland at the micro level.
Against this backdrop, this study aims to explore the intrinsic relationship between farmland scale and technical efficiency under the practical constraints of land fragmentation and an underdeveloped land rental market. Specifically, this paper empirically tests two core hypotheses: (1) whether an inverted U-shaped relationship exists between farmland scale and technical efficiency, and thus whether there is a theoretical optimal scale that maximizes efficiency; and (2) whether the marginal cost of land transfer moderates this relationship and significantly shifts the optimal turning point.
The contributions of this study are threefold. First, theoretically, we incorporate the physical characteristics of land fragmentation and the market frictions of informal land rental into a transaction cost framework. By analyzing the objective constraints of an increasing marginal cost on farmland scale expansion, this study provides a micro-level explanation for the current slowdown in the land transfer process. Second, empirically, we employ a Stochastic Frontier Analysis (SFA) model. By introducing a quadratic term for farmland scale into the technical inefficiency equation, we calculate the optimal scale that maximizes the technical efficiency of the sampled farmers. Furthermore, using a constrained interaction model and grouped regressions, we examine the quantitative shifts in this optimal turning point under different natural endowments and market development levels. Third, in terms of measurement, using micro-level household survey data from 10 provinces, we introduce the relief degree of land surface (RDLS), the slope degree, and the village-level land transfer rate as objective proxy variables for physical resistance and institutional friction. This empirically operationalizes and tests the inherently unobservable marginal cost of land transfer, providing empirical evidence for related theoretical discussions.
The remainder of this paper is structured as follows. Section 2 systematically reviews the literature and underlying theoretical mechanisms regarding the relationship between farmland scale and production efficiency. Section 3 constructs an analytical framework from the perspective of marginal transaction costs of land transfer. Section 4 and Section 5 describe the empirical model specification and data sources, respectively. Section 6 reports the baseline estimation results. Section 7 uses grouped and interaction models to test the moderating effect of marginal costs on the optimal farmland scale. Finally, Section 8 presents the research conclusions and policy implications.
3. Theoretical Analysis
Based on microeconomic production theory, a farmer’s optimal farmland scale is determined by profit maximization behavior. Accordingly, we construct the farmer’s profit function $\pi(L)$, where $L$ represents the farmland scale:

$$\pi(L) = TR(L) - TC(L) = P \cdot f(L) - \left[ C_P(L) + T(L, \theta) \right] \tag{1}$$

where $TR(L) = P \cdot f(L)$ is total revenue and $TC(L)$ is total cost. $P$ represents the exogenous market price of agricultural products in a perfectly competitive smallholder market. $f(L)$ is the agricultural production function. Total cost $TC(L)$ comprises two components: direct agricultural production costs $C_P(L)$ (e.g., fertilizer, labor, and machinery) and the transaction cost of land transfer $T(L, \theta)$. Here, $\theta$ represents physical land fragmentation and market frictions.

To maximize profit, we take the first-order partial derivative of the profit function with respect to farmland scale $L$ and set it to zero, yielding the First-Order Condition (FOC) for profit maximization:

$$\frac{\partial \pi}{\partial L} = P \cdot f'(L) - \frac{\partial C_P(L)}{\partial L} - \frac{\partial T(L, \theta)}{\partial L} = 0 \tag{2}$$

This implies that the equilibrium condition for the farmer to achieve the optimal farmland scale $L^{*}$ is that marginal revenue ($MR = P \cdot f'(L)$) equals total marginal cost, which is the sum of the marginal production cost $MC_P = \partial C_P / \partial L$ and the marginal transfer transaction cost $MC_T = \partial T / \partial L$:

$$MR = MC_P + MC_T \tag{3}$$
Given the inherent nature of agricultural production, household labor and management capacity impose rigid constraints. Consequently, within the rational stage of production, marginal output decreases as farmland scale expands ($f''(L) < 0$). This leads to a gradual decline in marginal revenue $MR$. Meanwhile, marginal production cost $MC_P$ (comprising inputs such as fertilizer, machinery, and labor) is primarily constrained by a given technological baseline. Therefore, under similar technological and price conditions, the core variable causing significant differences in the optimal farmland scale among farmers is the marginal cost of land transfer $MC_T$, driven by topographical features and market frictions.
This study posits that the marginal cost of land transfer primarily arises from the additional transaction costs incurred for each extra plot leased in. This increase is driven by three main factors. First, information asymmetry exists between transacting parties. In highly fragmented regions, obtaining accurate information on the quality and cropping history of numerous individual plots is difficult, particularly for external lessees [68,69]. Second, informal contracts (e.g., verbal agreements) dominate the land rental market; consequently, managing a larger number of plots increases the probability of default [70,71]. Third, fragmentation prolongs the transaction process. To transfer an equivalent area of land, highly fragmented regions require more individual transactions, directly escalating transaction costs. Therefore, achieving a specific farmland scale in severely fragmented areas requires leasing a larger number of plots. This exacerbates market frictions stemming from information asymmetry and contract instability, ultimately driving a sharper increase in the marginal cost of land transfer.
Based on differences in physical land fragmentation and market frictions ($\theta$), we define two comparative scenarios. These scenarios illustrate the dynamic constraints of the marginal cost of land transfer on the optimal farmland scale (Figure 1).
The “Ideal Scenario” assumes completely contiguous land and a frictionless land rental market. Farmers leasing in land only pay a fixed rent without incurring additional search or default prevention costs. In this case, the marginal cost of land transfer is constant ($\partial MC_T / \partial L = 0$). The slope of the total cost ($TC$) curve is mainly determined by stable production costs, showing a relatively gentle upward trend.
Conversely, the “Realistic Scenario” features severe land fragmentation and information asymmetry. Farmers must negotiate with multiple lessors. Furthermore, the default risk of informal verbal agreements rises significantly with the number of plots. Consequently, the implicit transaction cost required for each additional unit of leased land continuously increases. Mathematically, this manifests as an increasing marginal cost of land transfer as farmland scale expands ($\partial MC_T / \partial L > 0$). Therefore, the marginal cost under the realistic scenario ($MC_{\text{real}}$) is higher and steeper than that under the ideal baseline ($MC_{\text{ideal}}$). This drives the realistic total cost (TC) curve to increase at an increasing rate.
Determining the farmer’s optimal farmland scale is an endogenous process that adjusts dynamically with the marginal cost of land transfer. Based on the total revenue ($TR$) and total cost ($TC$) curves (Figure 2), profit maximization occurs when their tangents are parallel. At this point, marginal revenue equals marginal cost ($MR = MC$). The corresponding horizontal coordinate represents the optimal farmland scale. Agricultural product prices are exogenous, and marginal output diminishes. Consequently, the TR curve is concave with a decreasing slope. Under this constraint, we shift the TC curves of the realistic and ideal scenarios to establish tangency with the TR curve. In the realistic scenario, the marginal cost increases rapidly. Thus, its TC curve satisfies the equal-slope condition ($MR = MC_{\text{real}}$) at a smaller farmland scale (Point A). Conversely, the slow growth of the marginal cost in the ideal scenario allows equilibrium ($MR = MC_{\text{ideal}}$) at a larger farmland scale (Point B).
Mathematical derivations and graphical analysis corroborate a core corollary: given that $MC_{\text{real}} > MC_{\text{ideal}}$ and that marginal revenue monotonically decreases, the equilibrium condition is inevitably satisfied at a smaller scale in the realistic scenario. Specifically, the optimal farmland scale at Point A must be strictly smaller than that at Point B ($L_A^{*} < L_B^{*}$). This implies that, under the realistic constraints of land fragmentation and market frictions, spontaneous land transfers can only achieve a suboptimal equilibrium. The nonlinear increase in the transaction cost of land transfer directly caps the upper limit of the optimal farmland scale.
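The comparison between Point A and Point B can be illustrated numerically. The sketch below is not the paper's model: it assumes a production function $f(L) = L^{0.5}$, a constant marginal production cost, and a marginal transfer cost that is linear in scale (`rent + friction * L`). All parameter values and function names are hypothetical and chosen only to make the mechanism visible.

```python
# Illustrative only: functional forms and parameter values are assumptions,
# not taken from the paper.

def marginal_revenue(scale, price=100.0, alpha=0.5):
    """MR = P * f'(L) for an assumed production function f(L) = L**alpha."""
    return price * alpha * scale ** (alpha - 1.0)


def marginal_cost(scale, mc_production=8.0, rent=2.0, friction=0.0):
    """Total marginal cost MC_P + MC_T, with MC_T assumed to be rent + friction * L."""
    return mc_production + rent + friction * scale


def optimal_scale(friction, lo=1e-6, hi=1e4, tol=1e-9):
    """Solve MR(L) = MC(L) by bisection; MR - MC is strictly decreasing in L."""
    def gap(scale):
        return marginal_revenue(scale) - marginal_cost(scale, friction=friction)

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) > 0:   # MR still above MC: the optimum lies to the right
            lo = mid
        else:              # MR at or below MC: the optimum lies to the left
            hi = mid
    return 0.5 * (lo + hi)


scale_ideal = optimal_scale(friction=0.0)      # Point B: constant MC_T
scale_realistic = optimal_scale(friction=0.5)  # Point A: increasing MC_T
print(f"ideal: {scale_ideal:.2f}, realistic: {scale_realistic:.2f}")
```

Under these assumed parameters the realistic optimum is smaller than the ideal one, which is the ordering $L_A^{*} < L_B^{*}$ stated in the corollary. Any strictly positive `friction` produces the same ordering, because it raises marginal cost at every scale while marginal revenue is unchanged.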
Based on the theoretical analysis, the optimal farmland scale adjusts dynamically with changes in the marginal cost of land transfer. This dynamic process can be summarized into three scenarios. First, factors increasing marginal costs reduce the optimal farmland scale. The land fragmentation issue discussed in this study exemplifies this. Second, factors that do not change marginal costs have no effect on the optimal farmland scale. As shown in Figure 2, the point of tangency between the total cost and total revenue curves determines the optimal farmland scale; therefore, factors not affecting marginal costs will not alter this optimum. However, various production inputs, such as pesticides, fertilizers, and labor, still influence the overall process, as they constitute the marginal production cost ($MC_P$) in the earlier formula. Third, factors decreasing marginal costs can expand the optimal farmland scale. Practices in China’s land rental market demonstrate many such factors, including the implementation of land titling policies, the establishment of land trading platforms, and the formation of land cooperatives. These initiatives effectively reduce information asymmetry and transaction time. Consequently, they lower the marginal cost of land transfer and expand the optimal farmland scale.
To analyze the impact mechanism of the marginal cost of land transfer on the optimal farmland scale, the previous mathematical derivations rely on two key simplifying assumptions: price exogeneity and land homogeneity. Given the complex micro-environment of Chinese agriculture, relaxing these assumptions provides a deeper explanation for the dynamic changes in the optimal farmland scale. This also highlights the limitations and potential extensions of this theoretical model.
On the one hand, the ideal scenario assumes uniform and completely contiguous land (homogeneity), thereby isolating the net effect of transaction costs on farmland scale. However, in the real land rental market, land quality often exhibits high spatial heterogeneity. As farmland scale expands, farmers are often forced to lease in marginal plots located farther from the village, with poorer infrastructure or lower soil fertility. This decay of physical attributes not only increases the difficulty of field management but also serves as a key objective driver behind the steep rise in the marginal cost ($MC_{\text{real}}$) curve in the realistic scenario. This implies that if land heterogeneity is endogenized, the constraining effect of fragmentation on the optimal farmland scale would be even more severe than the theoretical expectation in Figure 2.
On the other hand, this model treats farmers as price takers in a perfectly competitive market, assuming constant prices for agricultural products and inputs. Although this aligns with the market reality for most Chinese smallholders today, a farmer’s bargaining power in input procurement and product sales increases significantly once their farmland scale crosses a certain threshold (e.g., transitioning into large household farms). In this situation, the price parameter shifts from a constant to an increasing function of farmland scale, thereby slowing the decline in the slope of the total revenue (TR) curve. This suggests that if the market pricing mechanism grants a scale premium, the ceiling for the optimal farmland scale in the ideal scenario (Point B) could expand even further.
Based on micro-level household survey data from 10 Chinese provinces, this paper employs the Stochastic Frontier Analysis (SFA) model to empirically test the theoretical hypotheses. The SFA model measures the distance between farmers’ actual output and the production frontier (i.e., technical inefficiency). Empirically, this indicator captures the efficiency loss caused by deviating from the optimal farmland scale defined in the theoretical model. Following our theoretical derivations, the empirical analysis proceeds in two steps. First, we test for an inverted U-shaped relationship between farmland scale and technical efficiency to verify the existence of an optimal farmland scale. Second, we utilize proxy variables for the marginal cost of land transfer (relief degree of land surface, slope degree, and village-level land transfer rate). Using grouped and interaction term models, we examine how the turning point of the optimal farmland scale dynamically shifts under varying physical resistances and market frictions. Through this design, we leverage micro-survey data to empirically test the nonlinear relationship between farmland scale and technical efficiency, alongside the moderating effect of the marginal cost of land transfer.
4. Method and Model Specification
Since the introduction of the stochastic frontier production function [72], it has been widely used to assess farmers’ technical efficiency. Various methods exist to measure technical efficiency, depending on data type (cross-sectional or panel) and inefficiency term specifications [73]. Generally, analyzing the heterogeneity of technical efficiency follows two main approaches. The first approach is a two-step strategy that initially estimates technical efficiency using a production function and subsequently evaluates the impact of exogenous factors via regression. However, Wang and Schmidt [74] and Belotti et al. [73] demonstrate that this approach introduces severe estimation biases. The second approach employs a flexible simultaneous estimation method, directly incorporating exogenous factors into the distribution parameters of the inefficiency term $u_{ij}$ (Equation (7)). Because $u_{ij}$ follows a truncated normal distribution, its variance is a function of both $\mu$ and $\sigma_u^2$. Thus, the heteroskedasticity of $u_{ij}$ can be modeled using a non-constant $\mu$, a non-constant $\sigma_u^2$, or both [75]. This study primarily focuses on the exogenous factors influencing technical efficiency rather than its inherent uncertainty. Therefore, following Kumbhakar et al. [76], Huang and Liu [77], and Battese and Coelli [78], we specify the mean of the inefficiency term ($\mu_{ij}$) as observation-specific while keeping $\sigma_u^2$ constant.
Equations (4)–(7) detail the specific model, where subscripts $i$, $j$, and $k$ denote households, crops, and input variables, respectively. Compared to the Cobb–Douglas (C-D) production function, which imposes constant output elasticities and a unitary elasticity of substitution, the translog production function relaxes both restrictions by adding squared and cross-product terms of the inputs. This allows it to more flexibly capture interactions and nonlinear relationships among inputs. Therefore, we employ the translog production function to estimate technical efficiency. To further verify the statistical validity of this functional specification, we conduct a Likelihood Ratio (LR) test in the empirical analysis to compare it against the C-D model.
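The LR comparison can be sketched as follows. The C-D model is the translog model with all second-order coefficients restricted to zero, so the number of restrictions equals the number of squared and cross-product terms. The log-likelihood values below are hypothetical placeholders, not estimates from this study; the critical value is the standard 5% chi-square value for 10 degrees of freedom.

```python
from itertools import combinations_with_replacement


def translog_extra_terms(n_inputs):
    """Count the second-order translog terms (squares and cross products)
    that the Cobb-Douglas form restricts to zero."""
    return len(list(combinations_with_replacement(range(n_inputs), 2)))


def likelihood_ratio(loglik_restricted, loglik_unrestricted):
    """LR = -2 * (lnL_restricted - lnL_unrestricted)."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)


# Four inputs (land, materials, machinery, labor): 4 squares + 6 cross products.
n_restrictions = translog_extra_terms(4)

# 5% critical value of the chi-square distribution with 10 degrees of freedom.
CHI2_CRIT_5PCT_DF10 = 18.307

# Hypothetical log-likelihoods, for illustration only.
lr_stat = likelihood_ratio(loglik_restricted=-1250.0, loglik_unrestricted=-1230.0)
reject_cobb_douglas = lr_stat > CHI2_CRIT_5PCT_DF10
print(n_restrictions, lr_stat, reject_cobb_douglas)
```

A statistic above the critical value would favor the translog form. The fixed critical value applies only to the four-input case shown here; a different number of inputs changes the degrees of freedom.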
Regarding variable selection, this study follows the microeconomic theory of agricultural production. In Equation (4), $\ln y_{ij}$ denotes the logarithmic yield of crop $j$ (wheat, rice, or maize) for household $i$. $x_{ijk}$ represents core production inputs, encompassing land (sown area), capital and technology (material costs and purchased mechanization services), and labor (labor costs). $\eta_j$ and $\lambda_r$ are crop and regional fixed effects, respectively, controlling for unobservable heterogeneity. $v_{ij}$ is an independent and identically distributed (i.i.d.) white noise error term. $u_{ij}$ is the technical inefficiency term, modeled as a non-negative truncated normal random variable with an observation-specific mean $\mu_{ij}$ and a fixed variance $\sigma_u^2$.
In the inefficiency equation, $\ln L_i$ denotes the logarithm of the household’s farmland scale. The model includes the quadratic term of $\ln L_i$ to capture the potential optimum of the farmland scale. The variable $Plots_{ij}$ indicates the number of plots, controlling for land fragmentation. The coefficients $\delta_1$ and $\delta_2$ in Equation (7) are the parameters of primary interest. Significant coefficients indicate a nonlinear quadratic relationship between farmland scale and technical efficiency. Consequently, the optimal farmland scale can be calculated using Equation (8). Because $L_i$ is logarithmically transformed, we apply an exponential function to derive the actual optimal farmland scale in Equation (8).
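For orientation, a specification consistent with the verbal description above could be written as follows. This is a reconstruction under assumptions rather than a reproduction of Equations (4)–(8): the symbols $\beta$, $\delta$, $\boldsymbol{\gamma}$, $\eta_j$, $\lambda_r$, and the control vector $\mathbf{z}_i$ are notational choices made here, and the exact set of controls is not specified in this section.

```latex
\begin{align}
\ln y_{ij} &= \beta_0 + \sum_{k} \beta_k \ln x_{ijk}
  + \frac{1}{2} \sum_{k} \sum_{l} \beta_{kl} \ln x_{ijk} \ln x_{ijl}
  + \eta_j + \lambda_r + v_{ij} - u_{ij} \\
v_{ij} &\sim N\left(0, \sigma_v^2\right) \\
u_{ij} &\sim N^{+}\left(\mu_{ij}, \sigma_u^2\right) \\
\mu_{ij} &= \delta_0 + \delta_1 \ln L_i + \delta_2 \left(\ln L_i\right)^2
  + \delta_3 \, Plots_{ij} + \mathbf{z}_i' \boldsymbol{\gamma} \\
L^{*} &= \exp\left(-\frac{\delta_1}{2 \delta_2}\right)
\end{align}
```

The last line follows from setting $\partial \mu_{ij} / \partial \ln L_i = \delta_1 + 2 \delta_2 \ln L_i = 0$ and exponentiating. With $\delta_2 > 0$ this point is a minimum of expected inefficiency, that is, a maximum of technical efficiency.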
To refine the model identification and estimation strategy, this study further addresses potential endogeneity biases and model specification dependencies. First, a household’s farmland scale ($\ln L_i$) may not be strictly exogenous. Unobservable heterogeneity, such as a farmer’s farming aptitude and management ability, simultaneously affects their land transfer decisions and technical efficiency. Direct estimation may induce simultaneity bias, thereby confounding the empirical estimates. To address this, we introduce the average farmland scale of other surveyed farmers in the same village as an instrumental variable. Employing the Control Function (CF) approach, we incorporate the first-stage residuals of the endogenous explanatory variable into the main regression to control for and isolate the endogeneity bias caused by unobservable factors. Second, to rule out interference from outliers and the sensitivity of the frontier model to distributional assumptions, we conduct multiple robustness checks alongside the baseline regression. These include altering the distributional assumption of the inefficiency term (e.g., using a half-normal distribution) and removing outliers, thereby ensuring the overall reliability of the empirical results.
5. Data and Description of Variables
5.1. Data
This study utilizes data from the China Rural Revitalization Survey (CRRS), conducted in 2020 by the Rural Development Institute of the Chinese Academy of Social Sciences. The survey employed a stratified random sampling method to select rural households nationwide. First, based on provincial economic development levels and spatial distribution, 10 provinces representing eastern, central, and western China were selected: Heilongjiang in the northeast; Zhejiang, Shandong, and Guangdong in the east; Anhui and Henan in the central region; and Guizhou, Sichuan, Shaanxi, and Ningxia in the west. Second, within each province, all counties were stratified into five groups based on per capita GDP, with one county randomly selected from each group. Next, following a similar procedure, three towns were randomly selected from each sampled county, and two villages from each town. Finally, 12 to 14 rural households were randomly drawn from each village (Figure 3). The survey yielded a total sample of 3833 rural households, 64% of which were engaged in agricultural production. To accurately match household characteristics with crop-level production information, we merged the main household database with the crop cultivation sub-database, retaining only households engaged in staple crop production.
The survey data contain crop-level information (e.g., inputs, outputs, sown area, and number of plots) and household-level information (e.g., farmland scale and household characteristics). These data form the basis for calculating crop-level technical efficiency and analyzing its determinants. Based on a stratified random sampling design according to provincial economic development and spatial distribution, the survey covers multi-level administrative units across eastern, central, and western China. The final sample objectively reflects the fundamental landscape of contemporary Chinese agriculture—dominated by smallholder farmers while incorporating a certain degree of scale operations—in terms of farmland scale, crop structure, factor inputs, and household characteristics. This provides a highly representative micro-level observational sample for this study.
5.2. Variable Description
Prior to the empirical analysis, we rigorously cleaned the raw data. First, we excluded observations with logical contradictions (e.g., instances where the sum of the three largest plots’ areas exceeded the total farmland scale, or where core economic indicators were non-positive). Second, at the crop level, we winsorized all continuous variables (including yield, farmland scale, number of plots, and various input costs) at the 1st and 99th percentiles. This mitigates potential estimation bias caused by extreme values and unusually large farms. Finally, we retained only the three major staple crops (wheat, rice, and maize) and removed observations with missing values for model variables. This yielded a final dataset of 1988 valid observations (see Table 2 for the sample screening path). Table 3 reports the summary statistics for all variables in the final sample.
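The winsorizing step can be sketched in a few lines. This is a minimal illustration, not the study's cleaning script: it uses a linear-interpolation percentile (one of several common conventions), treats a single list of values rather than winsorizing separately within each crop, and runs on made-up farm sizes.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of an ascending list."""
    if not sorted_values:
        raise ValueError("empty input")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def winsorize(values, lower_q=1.0, upper_q=99.0):
    """Clip each value to the [lower_q, upper_q] percentile range.
    Observations are kept, not dropped; only extreme values are replaced."""
    ordered = sorted(values)
    lo = percentile(ordered, lower_q)
    hi = percentile(ordered, upper_q)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical farm sizes in mu, including one extreme value.
sizes = [float(v) for v in range(1, 100)] + [5000.0]
clipped = winsorize(sizes)
print(min(clipped), max(clipped))
```

Unlike trimming, this keeps the sample size unchanged, so the extreme observation still contributes to estimation but with a capped value.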
Production frontier variables: The baseline stochastic frontier model includes one output and four factor inputs. The dependent variable is the logarithmic yield per unit area. Regarding factor inputs, material capital input is expressed as the logarithm of the total cost for fertilizer, pesticide, seeds, and water/electricity (mean = 282.66 Yuan/mu). Labor input comprises actual hired labor costs and the opportunity cost of household labor (calculated using the village’s average non-agricultural wage), measured as the logarithm of their sum. Machinery input is measured as the logarithm of purchased mechanization service fees per unit area. Land input is measured as the logarithm of the actual sown area for a specific crop.
Core explanatory variables: The core explanatory variables for exploring the impact mechanism on technical efficiency are the logarithm of the household’s farmland scale and its squared term. Descriptive statistics show that the average farmland scale of the sampled households is 25.90 mu, ranging from a minimum of 0.8 mu to a maximum of 420 mu. Concurrently, the average sown area at the single-crop level is 16.69 mu, which is significantly lower than the total household farmland scale. Given that 69.1% of the sampled households planted two or more types of crops (an average of 3.95 crop types), this objectively reflects the mixed-cropping practices widely adopted by current smallholder farmers.
Control variables: Based on the household decision-making model, we selected additional control variables at the household and household head levels for inclusion in both the inefficiency equation and the production function. Regarding cropping structure, the number of plots (mean = 3.95) is introduced to control for physical land fragmentation. Regarding household head characteristics, the vast majority are male (96%), with education concentrated at the primary and junior high school levels (81%). Additionally, 14% serve as village cadres, and only 5% are engaged in non-agricultural employment. Regarding household and risk characteristics, the average household size is 3.26; 66% of households were unaffected by natural disasters; and 50% purchased agricultural insurance. Furthermore, the model incorporates crop and regional fixed effects to absorb unobservable heterogeneity.
Mechanism and robustness check variables: To support subsequent empirical identification and extended analysis, we selected three categories of auxiliary indicators. First, to address endogeneity, we employed the average farmland scale of other surveyed farmers in the same village as an instrumental variable (IV). This indicator reflects the overall village land rental environment, thereby indirectly affecting individual farmers’ land transfer decisions without directly impacting their agricultural production processes. Second, for mechanism verification, we selected the relief degree of land surface, the slope degree, and the village-level land transfer rate as proxy indicators for the marginal cost of land transfer. These physical attributes capture the physical resistance constraining plot consolidation and mechanized operations, while the transfer rate reflects the development level of the local market and information frictions. Due to micro-survey data limitations, these objective indicators, while unable to directly quantify individual implicit transaction costs, effectively map the core constraints on farmland scale expansion, providing a valid empirical substitute. Third, for robustness checks, we extracted the actual number of crop types (mean = 3.95) to measure crop diversification and utilized the village-level average off-farm wage during the idle season to recalculate labor opportunity costs.
6. Results and Discussion
6.1. Baseline Regression Results
The technical efficiency calculated using Stochastic Frontier Analysis (SFA) is presented in Table 4. The average technical efficiency for the full sample is 0.677. This indicates that the technical efficiency loss in China’s staple crop production is approximately 32.3%, leaving substantial room for improvement. Disaggregated by crop, the technical efficiencies for wheat, rice, and maize are 0.708, 0.733, and 0.669, respectively. These estimates are consistent with recent SFA studies on Chinese agriculture. Specifically, our full-sample average (0.677) closely matches the 0.675 efficiency reported by Zeng and Hu [79] for major grain-producing regions. Furthermore, our crop-specific findings align well with Shi and Paudel [80], who found a 0.74 efficiency for rice, and remain broadly in line with the 0.78 efficiency observed for independent family farms by Gong et al. [81]. This agreement with prior estimates supports the plausibility of our empirical approach.
The differences in technical efficiency among crops primarily stem from the objective constraints of China’s multiple-cropping systems and the differentiated resource allocation behaviors of farmers. In major grain-producing regions, a double-cropping rotation system is widely adopted. Wheat and rice mostly serve as preceding or prioritized primary crops, with relatively sufficient farming preparation periods. This enables farmers to invest sufficient agricultural inputs and implement refined and intensive field management. In contrast, maize is predominantly planted as a succeeding crop. Constrained by tight planting windows, soil fertility depletion from preceding crops, and dispersed managerial efforts, the production and operation processes of maize tend to be relatively extensive. Furthermore, in terms of economic positioning, rice and wheat are core staple crops, which attract greater production priority from farmers. Conversely, maize exhibits distinct commercial characteristics, possesses a lower cultivation threshold, and is highly compatible with part-time farming. Driven by the opportunity costs of off-farm employment, labor inputs and field maintenance during the maize growing season are relatively insufficient. The aforementioned rotation constraints and differences in labor allocation jointly lead to the overall technical efficiency of maize being lower than that of rice and wheat.
Finally, the technical efficiency across the full sample ranges from a minimum of 0.057 to a maximum of 0.954. This wide variation indirectly reveals substantial disparities among Chinese smallholder farmers in terms of agricultural production methods and managerial capacity, indicating that standardized modern agricultural practices have not been widely adopted.
Table 5 reports the baseline regression results. Model (1) presents the estimates for the full sample, while Models (2) through (4) report the results for the wheat, rice, and maize subsamples, respectively. Overall, the estimates align with theoretical expectations. Given our primary focus on the impact of farmland scale on technical efficiency, we concentrate our analysis on the estimates in Panel B (the inefficiency term).
Across both the full sample and the three staple crop subsamples, the linear term coefficients for farmland scale are significantly negative at the 1% or 5% level, while the quadratic term coefficients are all significantly positive. This implies a U-shaped relationship between household farmland scale and technical inefficiency. Given the strict inverse mapping between technical efficiency and technical inefficiency, this strongly corroborates an inverted U-shaped relationship between farmland scale and technical efficiency. Economically, in the initial stage of farmland scale expansion, farmers can more effectively spread the fixed costs of machinery purchases and purchased mechanization services. This achieves optimal factor allocation (economies of scale), thereby driving technical efficiency upward. However, as the farmland scale crosses the optimal turning point, the inherent biological characteristics of agricultural production become pronounced. Constrained by tight time windows during peak farming seasons and physical land fragmentation, farmers’ supervision costs for hired labor and production processes rise significantly. Constraints on managerial capacity ultimately lead to diseconomies of scale, causing technical efficiency to decline.
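The turning point implied by a quadratic in log farmland scale can be computed directly from the two coefficients. The sketch below uses hypothetical coefficient values, not the estimates in Table 5, and the function names are illustrative.

```python
import math


def turning_point(delta_linear, delta_quadratic):
    """Farmland scale (in levels) at which a quadratic in ln(scale) reaches
    its extremum: exp(-delta1 / (2 * delta2))."""
    if delta_quadratic == 0:
        raise ValueError("quadratic coefficient must be non-zero")
    return math.exp(-delta_linear / (2.0 * delta_quadratic))


def matches_reported_sign_pattern(delta_linear, delta_quadratic):
    """True for a negative linear and positive quadratic coefficient in the
    inefficiency equation, i.e. U-shaped inefficiency and inverted U-shaped
    efficiency, with the turning point above one unit of scale."""
    return delta_linear < 0 < delta_quadratic


# Hypothetical coefficients, chosen only for illustration.
d1, d2 = -0.6, 0.1
optimum = turning_point(d1, d2)  # equals exp(3.0)
print(round(optimum, 2), matches_reported_sign_pattern(d1, d2))
```

Because the regressor is in logs, the exponential transformation is what converts the extremum back into mu. A positive quadratic coefficient is what makes that extremum a minimum of inefficiency rather than a maximum.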
Notably, after controlling for total farmland scale, the coefficient for the number of plots is significantly negative in the full sample and some subsamples. This indicates that more plots are associated with lower technical inefficiency (i.e., higher technical efficiency). Combined with the data in Section 5 showing an average of 3.95 crops per household, this finding reveals a distinctive logic within the Chinese smallholder production model: moderate land fragmentation provides smallholder farmers with a short-term efficiency compensation mechanism. Farmers diversify crops to mitigate agricultural risks, match specific crops to plot-specific soil characteristics, and fully use household labor for intensive farming. Consequently, under current constraints, plot diversity translates into a production advantage.
Calculated from the regression coefficients, the theoretical optimal farmland scale for the full sample is 17.74 mu, while the actual average scale is 25.90 mu, exceeding the theoretical optimum. This deviation aligns with the current realities of Chinese agricultural production: the theoretical optimum is the scale at which technical efficiency peaks, whereas farmers’ actual farmland scale is jointly determined by profit-maximization motives, production technology characteristics, cost structures, and factor market conditions. Disaggregated by crop, the theoretical optimal scale for wheat is 13.60 mu, while the actual average farmland scale is 17.16 mu, exceeding the optimal level. Wheat production is significantly affected by land fragmentation. Dispersed plots drive up the transaction costs of machinery coordination and field management, lowering the ceiling for the optimal scale. At the same time, farmers have an intrinsic motivation to expand their farmland scale to spread the fixed costs of machinery inputs, resulting in an actual scale higher than the theoretical optimum. For rice, the theoretical optimal farmland scale is 16.76 mu, while the actual average scale reaches 31.32 mu, a more pronounced deviation. As a typical paddy crop, rice demands strict precision in water and fertilizer management, as well as pest control. Scale expansion significantly increases management and supervision costs, thereby restricting the optimal scale. However, higher profit margins per unit area, coupled with the widespread adoption of purchased mechanization services (e.g., machine transplanting and drying), mitigate labor constraints. Consequently, the actual scale expands well beyond the optimum. For maize, the theoretical optimal farmland scale is 33.79 mu, and the actual average scale is 27.87 mu, falling short of the optimal level. As a dryland crop, maize is highly amenable to mechanization and requires relatively extensive management. Management costs rise slowly with scale expansion, leading to a relatively higher optimal scale. In reality, however, the insufficient supply of contiguous land suitable for mechanized operations, coupled with high land rental and transaction costs, constrains the expansion of scale. This results in unrealized scale potential.
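The comparison between theoretical and actual scales can be reproduced with simple arithmetic. The sketch below assumes that the optimum is the turning point of a quadratic in the natural logarithm of farmland scale, exp(−b1 / (2·b2)); the function names and the coefficient pair passed to `optimal_scale` are hypothetical and are not the estimates in Table 5. The optimum and average values in `reported` are the figures quoted in the text.

```python
import math


def optimal_scale(b_linear, b_quadratic):
    """Turning point of b_linear*ln(S) + b_quadratic*ln(S)**2, in units of S."""
    if b_quadratic <= 0:
        raise ValueError("a U-shaped inefficiency curve needs a positive quadratic term")
    return math.exp(-b_linear / (2.0 * b_quadratic))


def gap(optimum, actual):
    """Relative deviation of the actual average scale from the optimum."""
    return (actual - optimum) / optimum


# Hypothetical coefficients, for illustration only (not from Table 5).
print(f"illustrative turning point: {optimal_scale(-1.0, 0.2):.2f} mu")

# (theoretical optimum, actual average scale) in mu, as reported in the text.
reported = {
    "full sample": (17.74, 25.90),
    "wheat": (13.60, 17.16),
    "rice": (16.76, 31.32),
    "maize": (33.79, 27.87),
}

for name, (opt, act) in reported.items():
    side = "above" if act > opt else "below"
    print(f"{name:12s} optimum={opt:6.2f} actual={act:6.2f} "
          f"{side} optimum by {abs(gap(opt, act)):.1%}")
```

Read this way, rice shows the largest relative deviation above its optimum, and maize is the only crop whose average scale falls below it.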
Importantly, because farmers generally engage in mixed-cropping practices, the estimation results in this section reflect an overall trend. Although we are confident in the inverted U-shaped relationship between farmland scale and technical efficiency, which is consistent with the theoretical analysis in Section 3, the theoretical optimal farmland scale is not a fixed value. As emphasized by Kumbhakar et al. [82], efficiency estimates and parameter results are highly sensitive to unobserved heterogeneity and to the distributional and functional-form assumptions of the model. First, with varying land availability and the dynamic development of the land rental market, the optimal farmland scale will shift accordingly. Second, potential endogeneity and measurement biases in the empirical model may compromise the reliability of the baseline results. Therefore, in Section 6.2 and Section 6.3, we employ an instrumental variable (IV) approach to address potential endogeneity and test the robustness of these findings using a series of alternative indicators and model specifications.
6.2. Addressing Endogeneity
The estimation of the relationship between farmland scale and technical efficiency may be confounded by endogeneity. Typically, unobservable heterogeneity, such as households’ field management experience and farming aptitude, simultaneously affects their land rental decisions and technical efficiency. Failing to account for this may bias the baseline estimates. Because the stochastic frontier model in this study includes a quadratic term for farmland scale, the traditional Two-Stage Least Squares (2SLS) method is not directly applicable: substituting first-stage fitted values into a nonlinear term does not generally yield consistent estimates. Therefore, we employ the Control Function (CF) approach for endogeneity correction. We select the average farmland scale of other surveyed households in the same village as an instrumental variable (IV). Economically, this average represents the overall land endowment and rental market development of the village. It constitutes an exogenous condition affecting a specific household’s decision to lease land in or out. Furthermore, this aggregate village-level environment is unlikely to directly affect the agricultural production process within a specific household, which supports the exclusion restriction required of an instrumental variable.
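The logic of the CF correction can be illustrated on simulated data. The sketch below is not the paper's estimator: it replaces the stochastic frontier second stage with ordinary least squares on an observed inefficiency index, and every parameter value, variable name, and the data-generating process itself are assumptions chosen for illustration. It shows only the two-step mechanics: regress log scale on the instrument, then add the first-stage residual to a second stage that contains both the linear and squared terms.

```python
import random


def ols(rows, y):
    """Least-squares coefficients via the normal equations (small problems only)."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(a[r][c]))
        a[c], a[p] = a[p], a[c]
        for r in range(k):
            if r != c:
                f = a[r][c] / a[c][c]
                a[r] = [x - f * w for x, w in zip(a[r], a[c])]
    return [a[i][k] / a[i][i] for i in range(k)]


rng = random.Random(0)
n = 4000
z = [rng.gauss(0, 1) for _ in range(n)]        # instrument: village-level scale of others
ability = [rng.gauss(0, 1) for _ in range(n)]  # unobserved driver of expansion
# Log farmland scale depends on the instrument and on the unobservable.
s = [3.0 + 0.8 * zi + 0.5 * ai + rng.gauss(0, 0.5) for zi, ai in zip(z, ability)]
# Inefficiency index: U-shaped in log scale, true turning point at 1.2 / (2 * 0.2) = 3.0.
u = [2.0 - 1.2 * si + 0.2 * si ** 2 + 0.6 * ai + rng.gauss(0, 0.3)
     for si, ai in zip(s, ability)]

# Naive second stage: ignores the unobservable.
b = ols([[1.0, si, si * si] for si in s], u)
tp_naive = -b[1] / (2 * b[2])

# Control function: first-stage residual enters the second stage as a regressor.
g = ols([[1.0, zi] for zi in z], s)
v = [si - g[0] - g[1] * zi for si, zi in zip(s, z)]
c = ols([[1.0, si, si * si, vi] for si, vi in zip(s, v)], u)
tp_cf = -c[1] / (2 * c[2])

print(f"turning point (log scale): naive={tp_naive:.2f}, CF={tp_cf:.2f}, true=3.00")
print(f"coefficient on first-stage residual: {c[3]:.2f}")
```

In this simulated setting, the unobservable raises both scale and inefficiency, so the naive turning point lies to the left of the true one and the residual coefficient is positive. A frontier model would include the residual in the inefficiency function and estimate by maximum likelihood rather than OLS.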
Table 6 reports the two-stage estimation results of the CF approach. In the first-stage regression, the coefficient on the instrumental variable is positive and significant at the 1% level across the full sample and all subsamples. Furthermore, the first-stage F-statistics (ranging from 50.04 to 251.39) far exceed the conventional threshold for weak instruments, indicating that the IV satisfies the relevance condition. Additionally, the first-stage residuals are significant in the second stage for most models. This points to endogeneity in farmland scale and justifies the use of the CF approach. After accounting for endogeneity, the signs and significance of the coefficients for the logarithm of farmland scale and its squared term remain robust, again supporting the inverted U-shaped relationship between farmland scale and technical efficiency.
Notably, after correcting for endogeneity bias, the theoretical optimal farmland scale for each crop shifts rightward relative to the baseline regression (the optimal scale for the full sample increases from 17.74 mu to 24.47 mu, while those for wheat and maize rise to 29.05 mu and 62.45 mu, respectively). As shown in Table 6, the coefficients for the first-stage residuals in the inefficiency function are mostly significantly positive. This indicates that the unobservable factors driving households to expand their farmland scale also exacerbate technical inefficiency. In the context of Chinese agriculture, this often reflects an extensive expansion model that prioritizes land accumulation over intensive management. Such behavior is frequently driven by non-productive motives, such as capturing scale subsidies, or occurs under conditions of insufficient complementary capital and modern managerial capacity. Because the baseline model does not isolate the additional inefficiency caused by this suboptimal expansion, it underestimates the optimal farmland scale. Importantly, the theoretical optima derived from the CF approach reflect the long-term scale potential under ideal factor matching. In reality, the optimal farmland scale achievable by households remains constrained by capital endowments and topographical conditions. In contrast, the theoretical optimal farmland scale for rice is comparatively rigid (rising slightly from 16.76 mu to 17.76 mu). This further highlights the stringent constraints that the intensive farming practices of paddy fields impose on households’ managerial capacity, making the optimal scale for rice less sensitive to endogeneity bias.
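The size of the rightward shift for each sample can be made explicit with a short calculation. This is a minimal sketch using only the baseline and CF-corrected optima quoted in the text; the helper name `relative_shift` is introduced here for illustration.

```python
# (baseline optimum, CF-corrected optimum) in mu, as reported in the text.
optima = {
    "full sample": (17.74, 24.47),
    "wheat": (13.60, 29.05),
    "rice": (16.76, 17.76),
    "maize": (33.79, 62.45),
}


def relative_shift(baseline, corrected):
    """Proportional change in the optimum after the endogeneity correction."""
    return (corrected - baseline) / baseline


shifts = {crop: relative_shift(b, c) for crop, (b, c) in optima.items()}

# Print from the smallest to the largest shift.
for crop, shift in sorted(shifts.items(), key=lambda kv: kv[1]):
    print(f"{crop:12s} {shift:+.1%}")
```

The ordering makes the contrast in the text concrete: the rice optimum moves by only a few percent, whereas the wheat optimum roughly doubles.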
6.3. Robustness Checks
To verify the reliability of the baseline conclusions, we conducted multiple robustness checks addressing outlier sensitivity, distributional assumptions, functional specifications, variable measurement, and potential omitted variables. The estimation results are summarized in Table 7. Overall, across all model adjustments, the linear coefficient for farmland scale remains significantly negative, and the quadratic coefficient remains significantly positive. This indicates that the inverted U-shaped relationship between farmland scale and technical efficiency has strong empirical robustness.
First, considering that a small fraction of ultra-large farms might exert a disproportionate leverage effect on the fitted curve, Model (2) excludes observations in the top 1% of the farmland scale distribution. The re-estimation results show that the signs and significance of the core variables remain unaffected. Furthermore, the derived optimal farmland scale is 15.02 mu, which is highly consistent in magnitude with the baseline full-sample estimate, ruling out the possibility that the empirical conclusions are driven by extreme outliers.
Second, to address the sensitivity of the stochastic frontier model to distributional assumptions and functional forms, we conducted alternative specification tests. Model (3) replaces the baseline truncated-normal distribution assumption for the inefficiency term with a half-normal distribution. The significance of the farmland scale variables remains robust, with the optimal scale estimated at 11.87 mu. Model (4) restricts the frontier function from the translog form to the Cobb–Douglas (C-D) specification, yielding consistent scale effect characteristics. Meanwhile, the Likelihood Ratio (LR) test statistic is 20.99 (p = 0.021), rejecting the C-D functional form at the 5% level. This statistically supports the adoption of the translog functional specification in our baseline regression.
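The reported p-value can be checked against the chi-square distribution. The sketch below assumes 10 degrees of freedom, which is what a translog frontier with four inputs would imply relative to the C-D form (four squared terms plus six cross terms); the text does not state the degrees of freedom, so this value is an assumption. The function names are illustrative, and the closed-form tail probability used here holds for even degrees of freedom only.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable (closed form, even df only)."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Sum of (x/2)^i / i! for i = 0 .. df/2 - 1.
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def lr_statistic(loglik_unrestricted, loglik_restricted):
    """LR = 2 * (logL of the translog model - logL of the restricted C-D model)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


LR = 20.99  # reported in the text
DF = 10     # assumption: number of restrictions imposed by the C-D form
p_value = chi2_sf_even_df(LR, DF)
print(f"LR = {LR}, df = {DF}, p = {p_value:.3f}")
```

With these inputs the tail probability is close to the reported 0.021, which is consistent with the assumed number of restrictions.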
Finally, we supplemented the analysis with tests for variable measurement errors and potential omitted variables. In empirical agricultural microeconomics, due to rural labor market imperfections, directly using full off-farm wages to estimate the opportunity cost of household labor carries a risk of overestimation. Therefore, Model (5) recalculates the total cost of labor input using half of the village-level average off-farm wage during the idle season. The optimal farmland scale derived from this regression (17.70 mu) is highly consistent with the baseline result. Furthermore, as shown earlier, surveyed households generally engage in mixed-cropping practices. To avoid omitted variable bias in efficiency evaluation caused by differences in cropping patterns, Model (6) directly introduces crop diversity as a control variable into the inefficiency function. The results show that, after controlling for crop diversity, the inverted U-shaped effect of farmland scale remains highly significant. Additionally, the coefficient for crop diversity is significantly negative at the 1% level (−0.1446). This empirical finding corroborates the analysis regarding the number of plots in Section 6.1. It confirms that moderate crop diversification helps reduce technical inefficiency, serving as an effective mechanism to mitigate risks and improve efficiency under the current smallholder production model.
8. Conclusions and Policy Implications
8.1. Main Conclusions
This study explores the relationship between farmers’ farmland scale and technical efficiency through theoretical and empirical analyses, and further examines this relationship from the perspective of the marginal cost of land transfer. The results show an inverted U-shaped relationship between farmland scale and technical efficiency; that is, technical efficiency first increases and then decreases as the farmland scale expands. After correcting for the endogeneity bias of farmland scale, the theoretical optima of farmland scale for all staple crops shift rightward. Furthermore, there is significant crop heterogeneity in the optimal farmland scale; the theoretical optimal scales for wheat and rice are lower than that for maize, which possesses stronger suitability for mechanization.
This study further verifies the dynamic adjustment mechanism of the optimal farmland scale. The inverted U-shaped relationship implies the existence of a theoretical optimal scale for household agricultural production. However, this scale is not a fixed value; it adjusts with changes in the marginal cost of land transfer. The smaller the marginal cost of land transfer, the larger the household’s optimal scale. Empirical tests using farmland topography and the land rental market as exogenous indicators show that lower relief and gentler slopes, along with a more developed land rental market, correspond to lower physical resistance and transaction costs faced by households, resulting in a correspondingly larger optimal farmland scale.
Additionally, the research reveals the intrinsic logic underlying the current smallholder production mode. After controlling for the total farmland scale, an increase in the number of plots and crop diversification actually reduce technical inefficiency. This indicates that under realistic production constraints, moderate land fragmentation and mixed-cropping practices serve as effective short-term compensation mechanisms for households to mitigate natural and market risks and smooth labor utilization.
8.2. Policy Implications
Accelerating land transfer and improving the mechanization and modernization levels of agricultural production are realistic paths for China’s agricultural development. However, in recent years, the growth of the land transfer rate has slowed, and households’ farmland scales have remained relatively stable. This indicates that current land transfers may have reached a suboptimal equilibrium under realistic constraints. To break this equilibrium and improve overall efficiency, public policymaking needs to shift from a pure aggregate expansion orientation to targeted policies based on objective constraints. Efforts can proceed along three dimensions:
First, implement differentiated land transfer strategies and avoid a one-size-fits-all approach to scale expansion. Topographical features exert a significant constraining effect on the optimal farmland scale. In flat areas, policies can continue to encourage land transfer and contiguous integration to fully realize economies of scale. However, in restricted areas with high relief or slope degrees, policy planning must respect the objective reality that the optimal farmland scale is relatively small. For such regions, the policy focus should shift to promoting small and micro agricultural machinery suitable for complex terrain and cultivating specialty high-value-added agriculture, rather than the uncritical pursuit of large-scale land transfers.
Second, cultivate service-oriented third-party organizations and trading platforms to reduce the marginal transaction costs of land transfer. Optimizing the institutional environment can effectively expand the upper limit of the optimal farmland scale. The government should encourage the development of non-profit third-party institutions, such as grassroots agricultural cooperatives, village collective economic organizations, and standardized land trading platforms. By leveraging the coordinating role of such organizations, fragmented and dispersed plots can be consolidated first and then uniformly leased out to large grain farmers with genuine operational capacity. This approach can substantially reduce the transaction costs for large-scale farmers in information searching, bargaining, and contract signing, thereby promoting the rational expansion of farmland scale.
Third, coordinate short-term realistic tolerance with long-term modernization goals to steadily advance contiguous land consolidation. Currently, moderate land fragmentation and crop diversification are rational choices for smallholder farmers to mitigate natural and market risks and optimize labor allocation. This also forms a suboptimal equilibrium that maintains stable efficiency in agricultural production at the present stage. In advancing land consolidation initiatives, such as “merging small plots into large ones” and constructing high-standard farmland, mandatory measures like forced consolidation and mandatory mono-cropping should be avoided. Before the agricultural insurance system and purchased mechanization service networks can fully hedge the risks of mono-cropping operations, adequate flexibility for moderate crop diversification must be reserved for smallholder farmers, fully accommodating their realistic production and operational needs. From the perspective of long-term agricultural development trends, the outflow of rural labor is irreversible. Mechanization and scale expansion are the fundamental pathways to China’s agricultural modernization. The dispersed operation model, which relies heavily on intensive human labor, objectively constrains the popularization and application of large-scale modern agricultural machinery, making it difficult to adapt to the requirements of modern agricultural development. Public policies need to precisely grasp the balance between short-term tolerance and long-term guidance. Premised on respecting households’ willingness and protecting their legal rights, contiguous land consolidation should be advanced step by step. 
This will gradually break the long-term constraints of the persistent suboptimal equilibrium, clear spatial obstacles for large-scale agricultural operations and mechanized farming, achieve an organic connection between smallholder production and modern agricultural development, and create foundational conditions for cultivating new agricultural business entities.
8.3. Limitations and Outlook
This study still has certain limitations in data dimensions and indicator measurement, which need to be addressed in future research. First, the empirical inferences in this paper are based on cross-sectional data, which have inherent limitations in controlling for time-invariant unobservable heterogeneity (such as a household’s inherent management endowment and the long-term quality of micro-plots). Future research could construct high-quality, large-sample micro-level panel data to isolate individual fixed effects, thereby identifying the causal relationship between farmland scale and technical efficiency more rigorously. Second, when measuring the marginal cost of land transfer, this study selected exogenous proxy variables such as the relief degree, slope degree, and village-level transfer rate. These indicators mainly reflect the physical resistance and market environment of scale expansion. They do not directly capture the implicit transaction costs actually borne by farmers during processes like transfer negotiations, information searching, and default prevention. Future research could design targeted survey instruments to quantify these implicit transaction costs directly. Incorporating these quantified costs into the frontier efficiency model would further improve the testing of the dynamic evolution mechanism of the optimal farmland scale.