3.1. The FADN and the Research Sample
This research used farm microdata from the European Union’s Farm Accountancy Data Network (FADN), a European system for collecting accounting data from all EU member states. Data were collected from commercial farms in accordance with a unified methodology [66]. The farms presented in this study keep accounting records in accordance with that methodology, which applies consistently to all entities surveyed and does not change over time. This makes it possible both to compare farms with one another and to study the dynamics of phenomena. The FADN covers commercial farms that collectively account for 90% of a country’s standard output (SO). The standard output is the 5-year average of the production value of a specific production activity (crop or animal) obtained in 1 year from 1 hectare or from 1 animal under average production conditions for the region, with the following exceptions: edible mushrooms (100 m²), poultry (100 heads), and bees (1 bee colony). This coverage provided a basis for selecting the research sample, which, in the case of Poland, comprises ca. 12,000 holdings. Two basic sampling criteria were used: the economic size, determined by the absolute value of the SO, and the type of farming, which depends on how much each farming activity contributes to the total SO. The farms selected for this study form part of the Polish FADN research sample. Because changes in the assets-to-labor ratio and in labor productivity are central to the analysis, continuity of record keeping throughout the period was essential in the sampling procedure. The study needed to determine how the phenomena evolved over time, and therefore data were retrieved only from farms which kept continuous records in 2010–2019. In total, there were 3273 such farms in the database, which accounts for ca. 27% of the farms covered by the system each year (12,167 in 2019).
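The selection of farms with continuous records can be illustrated with a short sketch. It assumes a hypothetical input layout of (farm ID, year) pairs; the function name and the toy records are illustrative only and do not reflect the actual FADN database structure.

```python
from collections import defaultdict

STUDY_YEARS = range(2010, 2020)  # 2010-2019, the period named in the text


def continuous_farms(records, years=STUDY_YEARS):
    """Return IDs of farms that have a record in every year of the period.

    `records` is an iterable of (farm_id, year) pairs (hypothetical layout).
    """
    years_seen = defaultdict(set)
    for farm_id, year in records:
        years_seen[farm_id].add(year)
    required = set(years)
    return sorted(fid for fid, seen in years_seen.items() if required <= seen)


# Toy data: farm "A" reports in all ten years, farm "B" misses 2019.
records = [("A", y) for y in range(2010, 2020)]
records += [("B", y) for y in range(2010, 2019)]
balanced = continuous_farms(records)

# Share of the 2019 FADN sample retained, using the figures from the text.
share_pct = 3273 / 12167 * 100
```

Keeping only farms present in every year yields a balanced panel, which is what allows changes between the first and last periods to be computed for each farm.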
The following variables retrieved from the database were used to determine how the production function changes at various investment levels: gross value added; depreciation; investment subsidy installments; operating subsidies; total labor inputs; fixed assets; land, permanent crops, and production quotas; total output; total costs; and agricultural land area.
3.2. Farm Investment Levels
Agricultural overinvestment can be defined as a condition wherein investments are excessively high compared to the production potential [67]. Two parameters are needed to determine the level of overinvestment: the assets-to-labor ratio and labor productivity. This paper assumes that increasing the value of farm assets through investments is reasonable if it results in a proportional growth in labor productivity. Higher labor productivity drives the development of farms, including by increasing their efficiency. Therefore, overinvestment is defined as a situation wherein either of the following applies:
- The increase in the value of assets results in an absolute decline in labor productivity, which may be due to the high maintenance costs of particular assets (e.g., depreciation, insurance, repairs). The above is defined as absolute overinvestment; 
- Labor productivity grows at a slower rate than the value of assets. This is referred to as relative overinvestment. 
The calculations were based on microdata and used the FADN’s system variables. The panel data were created by calculating two-year arithmetic means for each farm, resulting in a dataset spanning 5 periods. Averaging was used to smooth out possible distortions in agricultural markets, such as those caused by the prices of productive inputs. Growth and decline rates were then calculated from the quantities defined below.
Here, LP is the labor productivity; $LP_{t4}$ is the labor productivity in period t4; $LP_{t0}$ is the labor productivity in the base period; ∆LP is the change in labor productivity; SE410 is the gross value added (total production less intermediate consumption, plus or minus the balance of subsidies and taxes on operating activities) (PLN); SE360 is the depreciation (determined based on the replacement value) (PLN); SE406 is the investment subsidy installments (portions of investment grants to be settled within 12 months) (PLN); SE605 is the operating subsidies (other than investment subsidies) (PLN); SE010 is the total labor inputs in annual work units (AWU).
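Written out, these definitions correspond to the expressions sketched below. The level formula follows from the listed FADN variables and from the deductions discussed in this section. Expressing the change as a relative rate between the base period and period t4 is an assumption of this sketch, chosen because the calculation is described as a growth or decline rate.

```latex
\begin{align}
LP &= \frac{SE410 - SE360 - SE406 - SE605}{SE010} \\
% Assumed form: relative change between the base period t0 and period t4
\Delta LP &= \frac{LP_{t4} - LP_{t0}}{LP_{t0}}
\end{align}
```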
The gross value added is a key metric of agricultural efficiency. After the depreciation is deducted, it becomes the net value added, the basic category of agricultural income [68]. Investment subsidy installments and operating subsidies were deducted as the next step. The net value added was used (rather than the family farm income) because the cost of external factors (hired labor, rents, and interest on loans) needed to be left out of the calculations in order to obtain a standardized metric of the economic performance of farms which rely on both their own and external productive inputs in their operations. In turn, subsidies were removed from the calculations because public support should not be counted toward labor productivity in economic terms.
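A minimal sketch of the labor productivity calculation and of the two-year averaging is given below. Applying the averaging to yearly productivity values, rather than to the underlying variables, is an assumption of this sketch. The function names are illustrative and the figures for the single farm are invented.

```python
from statistics import mean


def labor_productivity(se410, se360, se406, se605, se010):
    """Net value added less subsidies per unit of labor (PLN per AWU)."""
    return (se410 - se360 - se406 - se605) / se010


def two_year_means(yearly):
    """Collapse yearly values (2010..2019) into consecutive two-year means."""
    if len(yearly) % 2:
        raise ValueError("expected an even number of yearly values")
    return [mean(yearly[i:i + 2]) for i in range(0, len(yearly), 2)]


# Invented data for one farm: gross value added rises by 2000 PLN a year,
# while depreciation, subsidies, and labor input stay constant.
lp_yearly = [
    labor_productivity(100_000 + 2_000 * i, 20_000, 1_000, 15_000, 2.0)
    for i in range(10)
]
lp_periods = two_year_means(lp_yearly)  # five periods, t0 .. t4
```

The first and last elements of `lp_periods` play the roles of the base period and period t4 in the change calculation.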
Changes in the assets-to-labor ratio were calculated as the next step, with the value of the fixed assets less the value of land per full-time equivalent (FTE, expressed in AWU) used as the metric. The FTE ratio was used so as not to disturb the estimation of the relevant parameters. The rationale behind this approach is that overinvestment is a problem which ultimately boils down to a mismatch between the output and the extent of investments in machinery and buildings. As in the case of labor productivity, the study calculated the average values for the 5 periods (each period being the mean of two years) and defined the growth/decline rate from the quantities defined below.
Here, ALR is the assets-to-labor ratio; $ALR_{t4}$ is the assets-to-labor ratio in period t4; $ALR_{t0}$ is the assets-to-labor ratio in the base period; ∆ALR is the change in the assets-to-labor ratio; SE441 is the fixed assets (including agricultural land, farm buildings, forest plantings, machinery and equipment, and livestock) (PLN); SE446 is the land, permanent crops, and production quotas (PLN); SE010 is the total labor inputs (AWU).
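In symbols, the assets-to-labor ratio described here can be sketched as follows. The level formula restates the verbal definition (fixed assets less land, per unit of labor). The relative form of the change is again an assumption of this sketch, mirroring the treatment of labor productivity.

```latex
\begin{align}
ALR &= \frac{SE441 - SE446}{SE010} \\
% Assumed form: relative change between the base period t0 and period t4
\Delta ALR &= \frac{ALR_{t4} - ALR_{t0}}{ALR_{t0}}
\end{align}
```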
After calculating the two indicators necessary to determine the investment levels, the farms were assigned to groups in accordance with the authors’ own methodology:
- Absolute overinvestment: farms where the labor productivity drops while the assets-to-labor ratio grows;
- Relative overinvestment: farms where both the labor productivity and the assets-to-labor ratio increase, but the assets-to-labor ratio increases more than the labor productivity;
- Underinvestment: farms where both the labor productivity and the assets-to-labor ratio decline;
- Optimum investments: farms where both the labor productivity and the assets-to-labor ratio increase, and the labor productivity grows faster than the assets-to-labor ratio;
- Optimum investments with no increase in the assets-to-labor ratio: farms where the labor productivity grows while the assets-to-labor ratio neither grows nor declines.
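The grouping rules listed above can be sketched as a small function that takes the two changes as inputs. The function name, the labels, and the treatment of boundary cases are assumptions of this sketch: "no increase" in the assets-to-labor ratio is read as a change of zero or below, and ties or zero changes in labor productivity are returned as "unclassified".

```python
def classify_investment(d_lp, d_alr):
    """Assign a farm to an investment level from the changes in LP and ALR."""
    if d_lp < 0 and d_alr > 0:
        return "absolute overinvestment"
    if d_lp < 0 and d_alr < 0:
        return "underinvestment"
    if d_lp > 0 and d_alr > 0:
        if d_alr > d_lp:
            return "relative overinvestment"
        if d_lp > d_alr:
            return "optimum investment"
    # Assumption: "no increase" covers both an unchanged and a falling ALR.
    if d_lp > 0 and d_alr <= 0:
        return "optimum investment, no ALR increase"
    # Boundary cases (for example d_lp == 0) are left unassigned here.
    return "unclassified"


# Invented changes for five hypothetical farms.
examples = {
    "farm_1": classify_investment(-0.10, 0.25),
    "farm_2": classify_investment(0.05, 0.30),
    "farm_3": classify_investment(-0.08, -0.02),
    "farm_4": classify_investment(0.40, 0.15),
    "farm_5": classify_investment(0.12, 0.0),
}
```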
The five investment levels were derived from all possible combinations of changes in labor productivity and in the assets-to-labor ratio. The assets-to-labor ratio is an important part of the analysis of overinvestment because the production volume has a fundamental impact on labor productivity and depends on the mix of capital and labor inputs [69]. It is assumed that low levels of the assets-to-labor ratio have an adverse effect on how efficiently labor is used [70]. As regards the capital-to-labor ratio, higher values are also desirable, as they suggest greater amounts of investment. According to Lewis’ dual economy model [71,72], an increase in agricultural labor productivity releases surplus labor to other sectors of the economy and is therefore a prerequisite for economic development. Thus, it is important to take these two variables into account when analyzing the degree of overinvestment.
Table 1 presents the labor productivity for each overinvestment group to show the scale of overinvestment.
Labor productivity decreased on farms at the absolute and relative overinvestment levels, in the underinvested group, and in those demonstrating optimum investment amounts with no increase in the ALR. This indicates that only optimum investment is accompanied by an increase in labor productivity. These farms also recorded the highest labor productivity at the outset. It follows that farms which increased their labor productivity achieved faster growth in their total output than in their labor resources, which is one of the conditions for the development of farms. The assets-to-labor ratio is addressed in the next step (Table 2).
The results for the assets-to-labor ratio are similar to what was established for labor productivity. A higher value contributes to improvements in labor productivity. The assets-to-labor relationship is an important part of the analysis of overinvestment because labor productivity is fundamentally influenced by the volume of production, which depends on the combination of capital and labor inputs [73]. It is assumed that low levels of the assets-to-labor ratio have an adverse effect on how efficiently labor is used [74]. As regards the capital-to-labor ratio, higher values are also desirable, as they suggest greater amounts of investment. This, in turn, is to some extent related to the implementation of technical progress in agriculture, which results in higher levels of production efficiency [75].
3.3. Methodology for Using the Cobb–Douglas Production Function in General and in Research on Farm Overinvestment
Douglas’ research focused on the elasticity of the labor and capital supply, and on the impact of changes in that elasticity on income distribution [76]. He investigated different structures of the labor market and their impacts on wage levels and competition in that market. His research conducted together with Cobb resulted in the development of an innovative production function which describes the relationship between the output (product) and productive inputs, such as capital and labor. They analyzed the elasticity of production with respect to different combinations of productive inputs, which made it possible to understand how changes in inputs may affect the output [77]. The Cobb–Douglas production function has been broadly used in economic and empirical analyses as a way to model and assess the production efficiency in different industries and economies. The classic Cobb–Douglas power function takes the following form [78]:

$$Y = \beta_0 L^{\beta_1} K^{\beta_2}$$

where Y is the output; L is the labor inputs; K is the capital inputs; $\beta_0$ is the constant determined by technological and organizational progress; and $\beta_1$ and $\beta_2$ are the output elasticities of labor and capital, respectively.
The function was subsequently refined to make it more suitable for agriculture by adding land as an input; this did not change its form or interpretation. Supplemented in this way, it can be presented as follows [78]:

$$Y = \beta_0 L^{\beta_1} K^{\beta_2} Z^{\beta_3}$$

where Y is the output; L is the labor inputs (expressed as the number of employees, man-days, or FTEs); K is the capital inputs; Z is the land inputs (expressed as the area of agricultural land in ha); $\beta_0$ is the constant determined by technological and organizational progress; and $\beta_1$, $\beta_2$, and $\beta_3$ are the output elasticities of labor, capital, and land, respectively. The elasticity values are determined by the available technology [78].
An important feature of the Cobb–Douglas production function is that it allows economies of scale to be estimated. As reported by Osti [79], some authors have used it for this purpose. The economies of scale of production are expressed by the sum of the elasticities of capital, labor, and land. If all these productive inputs grow by one percent, the output can be expected to increase by a percentage approximately equal to that sum. The economies of scale were studied by Griliches and Ringstad [80], who also focused on the way that the production function is formulated. This allowed them to estimate the significance of each parameter comprising the production function. In this paper, it is important to demonstrate the economies of scale in the context of capital, labor, and land. Note, however, that capital itself plays a particular role in the investment t.
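The relationship between the sum of the elasticities and the response of output to a proportional increase in all inputs can be checked numerically. The sketch below uses invented elasticities and input levels; the function names and the tolerance are assumptions made for illustration.

```python
import math


def returns_to_scale(b_labor, b_capital, b_land, tol=1e-9):
    """Sum the output elasticities and label the resulting scale regime."""
    total = b_labor + b_capital + b_land
    if math.isclose(total, 1.0, abs_tol=tol):
        kind = "constant"
    elif total > 1.0:
        kind = "increasing"
    else:
        kind = "decreasing"
    return total, kind


def cobb_douglas(b0, labor, capital, land, b1, b2, b3):
    """Output of a three-input Cobb-Douglas function."""
    return b0 * labor**b1 * capital**b2 * land**b3


# Invented elasticities of labor, capital, and land.
b1, b2, b3 = 0.2, 0.6, 0.1
total, kind = returns_to_scale(b1, b2, b3)

# Raise every input by 1% and compare the outputs.
base = cobb_douglas(1.0, 2.0, 100_000.0, 40.0, b1, b2, b3)
scaled = cobb_douglas(1.0, 2.0 * 1.01, 100_000.0 * 1.01, 40.0 * 1.01, b1, b2, b3)
growth_pct = (scaled / base - 1) * 100  # close to, but not exactly, total
```

With these values, output grows by a factor of 1.01 raised to the power of the elasticity sum, so the percentage increase is close to the sum itself for small changes in inputs.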
Overinvestment is one of the dimensions of inefficiency [81]. In turn, efficiency can be assessed by analyzing the Cobb–Douglas production function, a mathematical formalization of the relationships between the output and the inputs used in the production process [82]. In this paper, the Cobb–Douglas function was used to investigate the impact of inputs on production volumes at different levels of overinvestment. The variables for the model were selected based on the relevant literature and on data availability in the FADN. The expenditure incurred at the farm level reflects the use of three productive inputs: labor, land, and capital. The most frequently used mathematical relationship, as mentioned above, is the Cobb–Douglas function, written here to take the time factor into account:

$$Q = \beta_0 L^{\beta_1} K^{\beta_2} Z^{\beta_3} \beta_4^{t}$$

Its linearized form is given as follows:

$$\ln Q = \ln\left(\beta_0 L^{\beta_1} K^{\beta_2} Z^{\beta_3} \beta_4^{t}\right)$$

From the properties of logarithms, the following equation is obtained:

$$\ln Q = \ln \beta_0 + \beta_1 \ln L + \beta_2 \ln K + \beta_3 \ln Z + t \ln \beta_4$$

where Q is the total output (SE131, PLN); L is the total labor inputs (SE010, AWU); K is the total costs (SE270, PLN); Z is the agricultural land area (SE025, ha); t is the time (years, t = 1, 2, …, 10, with t = 1 for 2010); $\beta_0$ is the constant; $\beta_1$, $\beta_2$, and $\beta_3$ are the output elasticities of labor, capital, and land, respectively; and $\beta_4$ is the parameter associated with the time variable.
The parameters β1, β2, and β3, as indicated earlier, represent the production elasticity for each productive input. The parameter associated with the time variable (t) should be interpreted as the average growth rate in the study period (calculated as (β4 − 1) × 100%).
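The conversion of the time parameter into a growth rate can be sketched as follows. The sketch assumes that time enters the function multiplicatively as β4 raised to the power t, so that a regression on logarithms returns ln β4 as the coefficient on t; the function name and the example value are illustrative.

```python
import math


def annual_growth_pct(ln_beta4):
    """Convert the log-scale time coefficient into (beta4 - 1) * 100%."""
    beta4 = math.exp(ln_beta4)
    return (beta4 - 1) * 100


# Hypothetical estimate: a time coefficient equal to ln(1.02).
growth = annual_growth_pct(math.log(1.02))
```

A coefficient of ln(1.02) on the time variable thus corresponds to output growing by about 2% per year with inputs held constant.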
The above model was estimated using the least-squares method. Cobb–Douglas function models were estimated for each level of overinvestment, using the microdata of every farm covered by the analysis. The farms were assigned to classes in accordance with the previously identified overinvestment levels in order to standardize the database and the results. The model was initially estimated based on 32,730 observations (3273 farms over 10 years), but 74 of them were excluded because their logarithmized data were incomplete. Thus, 32,656 observations were used in estimating the form of the function and in further analysis.
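The estimation step can be illustrated with a dependency-free sketch of ordinary least squares on the log-linear form. Skipping observations with non-positive values mirrors the exclusion of observations with incomplete logarithmized data. The function names, the normal-equation solver, and the synthetic noise-free data are assumptions for illustration and not the authors' estimation code.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_cobb_douglas(rows):
    """Fit ln Q on ln L, ln K, ln Z, and t; rows hold (Q, L, K, Z, t).

    Returns ([ln b0, b1, b2, b3, ln b4], number of observations used).
    """
    X, y = [], []
    for q, l, k, z, t in rows:
        if min(q, l, k, z) <= 0:
            continue  # the logarithm is undefined, so the observation is dropped
        X.append([1.0, math.log(l), math.log(k), math.log(z), float(t)])
        y.append(math.log(q))
    p = 5
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty), len(y)


# Synthetic data generated from known (invented) parameters, without noise.
rows = []
for i in range(40):
    l = 1.0 + ((3 * i) % 7) * 0.5
    k = 10_000.0 * (1 + (7 * i) % 11)
    z = 5.0 + (3 * i) % 13
    t = 1 + i % 10
    rows.append((50.0 * l**0.3 * k**0.6 * z**0.1 * 1.02**t, l, k, z, t))
rows.append((-100.0, 1.0, 1_000.0, 10.0, 1))  # negative output: excluded

coefs, n_used = fit_cobb_douglas(rows)
```

Because the synthetic data contain no noise, the fitted coefficients reproduce the parameters used to generate them, which makes the sketch easy to verify.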
Table 3 shows the basic descriptive statistics for the variables used in the Cobb–Douglas functions. The mean values of Q, K, L, and Z are 308,475.89 PLN, 256,332.79 PLN, 2.15 AWU, and 43.18 ha, respectively. Their medians are lower than the means, which suggests asymmetric distributions with long right tails. The coefficients of variation are high: 253% for Q, 303% for K, 133% for L, and 248% for Z. The ranges of the variables are very wide, indicating the presence of extreme outliers (from −368,172.39 to 30,233,247.81 for Q, from 4299.00 to 32,639,524.00 for K, from 0.11 to 111.35 for L, and from 0.00 to 3487.70 for Z). The skewness values for Q, K, L, and Z are 16.68, 19.84, 19.57, and 16.85, respectively, which means that the variables are strongly skewed to the right: the right tail of the distribution is longer than the left one, and the values concentrate on the left side of the distribution. The kurtosis values (435.11 for Q, 569.85 for K, 538.51 for L, and 410.52 for Z) suggest that the distributions have higher peaks and heavier tails than a normal distribution, which again points to a great number of extreme values (outliers). Due to the high variability and asymmetry of the sample data and the presence of extreme outliers, the decision was made to explain the variable Q using a Cobb–Douglas function, with the explanatory variables K, L, and Z entering as terms of a power function and the time (t) as a term of an exponential function. The logarithmic transformation of the variables reduced the variability and provided a more homogeneous dataset; observations with non-positive values, for which the logarithm is undefined, dropped out at this step. The data were cleaned of outliers before further calculations were made, making the results more reliable and more general.
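The effect of the logarithmic transformation on a right-skewed variable can be illustrated with a toy sample. The values below are invented, and the population formulas for the standard deviation and skewness are an assumption of this sketch, since the estimator behind the reported statistics is not specified.

```python
import math
from statistics import mean, median, pstdev


def coefficient_of_variation(xs):
    """Standard deviation as a percentage of the mean (population formula)."""
    return pstdev(xs) / mean(xs) * 100


def skewness(xs):
    """Third standardized moment (population formula)."""
    m, s = mean(xs), pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


# Invented right-skewed sample with one extreme value.
raw = [10.0, 12.0, 15.0, 20.0, 30.0, 50.0, 400.0]
logs = [math.log(x) for x in raw]

cv_raw = coefficient_of_variation(raw)
cv_log = coefficient_of_variation(logs)
skew_raw = skewness(raw)
skew_log = skewness(logs)
```

In this toy sample the median lies below the mean and the skewness is positive, and both the coefficient of variation and the skewness fall after taking logarithms.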