The Balanced Scorecard as a Tool Evaluating the Sustainable Performance of Chinese Emerging Family Farms—Evidence from Jilin Province in China

Abstract: The purpose of this paper is to apply the Balanced Scorecard (BSC) concept to the sustainable performance evaluation of emerging family farms in Jilin, China. A sustainable performance evaluation system was constructed based on the BSC. A questionnaire survey was used with a sample of 156 emerging family farms involved in the production of planting (grain, horticultural crops) and breeding (animal products) enterprises in Jilin, China. The Analytic Hierarchy Process (AHP)-Fuzzy Comprehensive Evaluation method (FCE) was used for the sustainable performance evaluation by different BSC dimensions, farm types, and regions. This empirical study revealed that the BSC is applicable for the farm sustainable performance evaluation in the Chinese context. The key is selecting suitable indicators for the evaluation index system while considering the particularity of market, resources, management, and personnel. The sustainable performance of the investigated family farms is in the slightly above moderate level as a whole. Financial performance and market performance are above moderate, while internal business process performance is moderate, and learning and growth performance is below moderate. They are facing difficult challenges to upgrade in terms of marketing and financing channels, branding, and organic production. Industrial differences existed in the farms’ sustainable performance. Farms combining planting and breeding have better sustainability, which could be a signal for transformation of the traditional single planting or breeding modes in China. The internal business process performance of grain farms is significantly less, due possibly to long-term policy support and protection with less of an emphasis on ecological outcomes. Subtle regional differences in the overall sustainable performance of surveyed family farms suggest that farm performance depends more on management than on external environment.


BSC as the Conceptual Framework for Sustainable Performance Evaluation
The evolution of enterprise performance evaluation can be summarized by the financial model and the balanced model. Since the 1980s, amid fierce global competition, researchers have improved the traditional financial performance evaluation model, and a series of performance evaluation views and methods has appeared. The obvious trend is that non-financial indicators have been included in the enterprise performance evaluation system, giving rise to the balanced model of performance evaluation. Life Cycle Theory proposes different strategic objectives at various phases, but one thing the phases have in common is the need to maintain balanced and sustainable development.
Therefore, the sustainable performance evaluation of the enterprise requires analysis of qualitative indicators, so as to combine the internal and external environments and financial and non-financial indicators, and to realize sustainable development. The sustainable performance evaluation model should involve every factor that affects the operation of the enterprise: whether financial results have reached a predetermined level, whether the customer base has shown healthy growth, whether members have developed through learning and education, and whether the farm has continued to innovate. The Balanced Scorecard evaluation system involves four dimensions: financial, customers, internal business processes, and learning and growth. The four dimensions basically cover the factors that affect the enterprise's sustainable performance evaluation. As such, the Balanced Scorecard reflects well the requirements of sustainable development for balance and sustainability, so it can be used as the conceptual framework for sustainable performance evaluation.

Characteristics of Family Farm Sustainable Performance
Family farms constitute over 98% of all farms and cover 53% of agricultural land globally. Across distinct contexts, family farming plays a critical role in global food production [11]. Management requirements for farming are high and will increase in the future, especially for enlarged family farms, due to the challenges of structural changes in agriculture, volatile agricultural product markets, rising operating costs, and growing competition for land [4,12,13]. In order to be successful in the long term, the modern farm business must be resilient and structured to buffer business uncertainty, while at the same time gain flexibility to respond to new opportunities in the marketplace and have the capacity to generate sufficient funds to support growth in real terms [14]. This implies that the farm manager must have the management skill and knowledge to handle risk, make the decisions that will capture opportunities that arise, ensure resource capacity is enhanced and maintained at all times, and minimize adverse social and environmental impacts off-farm [14]. In order to explore this management challenge in greater depth, we need to address one of the important issues when assessing farm business, that is, family farm sustainable performance.
In the most commonly used and established interpretations of sustainability in farming practice [15][16][17][18][19][20][21][22], sustainable agriculture is seen as a system aiming at improving environmental quality and the natural resources upon which the agricultural economy depends, and enhancing the quality of life for farmers and rural communities, while at the same time sustaining the economic viability of farm operations over the long term.
The above interpretation of sustainable agriculture encompasses the key aspects of sustainability and their applicability in agriculture. The main objective of sustainability in agriculture is to combine the economic, social, and environmental dimensions while taking into account continuity over time. Family farm sustainable performance can be defined in a number of ways: financial performance, productive performance, human resource development, or measured in terms of social or ecological performance, or output efficiency [14]. This enables the use of a holistic approach in the sustainable performance evaluation of family farms. Numerous instruments, such as accounting software, crop field cards, and sow and cow planners have been developed for daily management practice. However, these management tools often concentrate on traditional financial measures and are always insufficiently integrated to provide a clear picture of the outcome of the farming business. Exploring comprehensive approaches to assess the farm business and to identify as well as monitor closely the drivers of the outcome is necessary. Furthermore, the elements of the evaluation index system, which should include indicators describing the condition and prospects of farms, are essential in assessing family farm sustainability. Farm sustainable performance is a complex reflection of multiple factors, indicators, and structures, so a comprehensive evaluation system from multiple dimensions and levels is needed to assess a family farm's sustainable performance accurately and allow appropriate management decisions geared towards sustainable agriculture.

Family Farm Sustainable Performance in the Chinese Context
The evaluation of family farm sustainable performance is crucial in the Chinese context. In the past few decades, China has realized a dramatic increase in crop production, as well as economic and social improvement in rural areas [23]. The family farm in China is a new agricultural business entity, which is defined as engaging in scale, intensive and commercialized agricultural production and management, with family members as the main labor force and agricultural income as the main source of household income [24]. Many family farms have been emerging in China since the development of the family farm was proposed as a new type of agricultural organization to adapt to the new situation of agricultural development by the Communist Party of China (CPC) Central Committee in 2013 [23]. In 2015, the Ministry of Agriculture in China conducted a nationwide survey on the number, land area, types, and economic performance of family farms in 30 provinces, autonomous regions, and municipalities (excluding Tibet), and these are considered to be the latest and most comprehensive statistics on family farms in China so far. The macro data in this paper come from the statistics of this special survey. There were 142,000 family farms engaged in planting by the end of June 2015, accounting for 61.90% of the total number of family farms. Of these, 84,000 farms were engaged in grain production, accounting for 59.0% of the total number of planting family farms. Planting farms were followed by breeding farms (25.16%) and farms combining breeding and planting (8.96%) [23]. The proportion of planting family farms increased by 0.66%, while the proportion of family farms combining planting and breeding increased by 1.14% [25]. The number of farms combining planting and breeding has increased much faster than the other types.
In 2015, the total annual sales of the agricultural products of family farms was RMB 126 billion, and the average annual sales of each family farm reached RMB 368 thousand, an increase of 17.57% from that of 2014. The annual sales of most family farms are between RMB 100,000-500,000 and below RMB 100,000, accounting for 44.2% and 33.3%, respectively. However, the high production cost led to a low net profit for family farms. According to survey data, after the annual sales minus the agricultural production cost, the average annual profit of family farms in China in 2015 was only RMB 196 thousand [25].
Most Chinese family farms are developed from small farming households, so they are quite similar to the traditional small farming households in agricultural production and management, and different from the state-owned farms in China and the large-sized family farms in western countries. Based on a small farming household economy, Chinese family farmers care more about short-term economic benefits or find ways to obtain policy supports or subsidies, and lack long-term development plans, comprehensive performance and sustainable development capabilities. For these emerging Chinese family farms, how to keep a vigorous vitality, ensure high competitiveness, and explore future development potential are quite essential in such a context. Chinese family farms also take a certain scale of land as the basis for agricultural production. According to the data of the Ministry of Agriculture in China, the average land area of Chinese family farms is 10.12 hm², nearly 27 times that of small farming households. Because of the complex regional geographical and climatic conditions in China, the number and scale of family farms vary greatly from place to place. The number of family farms in southern China is generally higher than that in northern China, and the scale of family farms in northern China is generally larger than that in southern China [25]. However, the scale of the Chinese family farm is limited by the labor capacity of family members; for example, the standard of 100-150 mu land area for family farms in Songjiang, Shanghai, is based mainly on the labor income and management ability. Rising land rents also hinder the scale expansion of Chinese family farms. Because of the land fragmentation, of the 2,873,900 hm² of arable land operated by family farms in China, 2,124,600 hm² come from land transfer, accounting for 73.9% [25]. Therefore, the emerging family farms in China are generally small and medium-sized.
It is well recognized that the family farm of moderate scale, which has been widely developed in China, is the appropriate development path for Chinese agriculture [26]. However, some studies suggested that as the number of family farms in China grew rapidly, most family farms focused on short-term profits rather than long-term strategies characterized by greater demands for farm management. For example, Jiang (2017) indicated that despite the rapid growth in the number of family farms, the overall performance is low and unbalanced after evaluating the performance of family farms in Gansu province [27]. Another study, which evaluated the growth performance of 453 grain family farms selected from northern China, concluded that the overall performance is moderately upward and economic performance is good, while innovation performance is low [28]. A further study analyzed the performance of 670 family farms in 30 provinces, municipalities, and autonomous regions in China. The result shows that Chinese family farms are still in the rapid development phase with huge potential for improvement, facing the main problems of financing difficulties and lack of talents [29]. As it is commonly recognized that Chinese family farms have been developing quickly, while long-term strategies, balanced performance, and continuous improvement are expected, it is essential to analyze the sustainable performance of the Chinese family farms to provide information to decision makers (farm owners, policy makers, etc.) to adopt specific measures to ensure a high level of competitiveness and the future development potential of Chinese family farms. As such, studies about the sustainable performance of Chinese family farms are meaningful.
This study tries to answer the following questions: Is the BSC appropriate for farm sustainable performance evaluation in the Chinese context? Will the successful adoption of the BSC be limited by the development history, phase, scale and level of family farms, cultural background and institutional factors? What indicators and methods should be used in this evaluation? These questions are addressed through the following research aims: (1) This study proposed that the BSC model be adapted to the issue of Chinese family farm performance evaluation and developed a sustainable performance evaluation index system for Chinese family farms, which takes into account the key performance indicators considered most relevant. The system designed in this study captured financial performance (member's earnings, farm's benefits), market performance (customer relations, market status), internal business process performance (ecological performance, management level), and learning and growth performance (learning ability, innovation ability) to set indicators for analysis; (2) Additionally, we aimed at bringing more knowledge about the current situation of Chinese family farms. To this end, we used the Fuzzy Comprehensive Evaluation Method to further explore and rank the performance scores of different BSC dimensions, farm types, and regions. We collected questionnaire data from 156 family farms with planting (grain, horticultural crops) and breeding (animal products) in eastern, central, and western Jilin province from January to February 2019 in order to present a comprehensive overview and comparative analysis. This study took as a sample the family farms in Jilin province, a major agricultural province in northeast China. In Jilin province, traditional small farming households are still dominant, but family farms have been emerging rapidly since 2013.
Such a context is relevant as many farmers rely on agriculture to survive and are facing more difficult challenges in the new situation. It is a representative sample to study the family farms in undeveloped areas or agricultural areas in China.

Literature Review
In the international literature, some previous studies have attempted to apply the concept of the BSC to agriculture. These attempts were made in Denmark, Germany, Ireland, New Zealand, Ukraine, the UK, the USA, Italy, and Australia in various agribusiness sectors, including milk and animal production, fruit cooperatives, and food supply chains [4,13,30-38]. Literature on the Balanced Scorecard application in the agricultural sector is not abundant. Most BSC literature focuses on the corporate sector [4]. One critical factor that limited the large-scale adoption of the Balanced Scorecard in agribusiness seemed to be the predominance of small family farms, which rely mostly on physical work and lack formal strategic management [39]. However, the BSC is not excluded as a valuable tool for family farm business, and several studies have demonstrated that the application of the Balanced Scorecard in small and medium family farms could be successful [4,30-34,36]. Moreover, increased complexities of the farming business resulting from increased demand for product quality, food safety, and sustainable development have put forward new requirements for strategic management at farm level [4,30,39,40].
However, the current application of the BSC in agriculture was mainly in case studies [4,31,32,38,41]. For example, Byrne and Kelly (2004) used the BSC for milk production in six dairy farms in Ireland and developed it by taking the local conditions and structures into account [31]. Shadbolt (2003) applied the BSC in three sample farms in the process of strategy planning, implementing and controlling to identify the areas that enhance or restrict the BSC as a management tool for farming [32]. Another study identified key performance indicators most relevant to farm performance and transferred the BSC concept to three types of cropping farms in Germany [42]. There is a lack of quantitative analysis across multiple samples that could provide more information and guidance for farm management requirements.
Regarding the BSC dimensions, Kaplan and Norton (1992) introduced four BSC dimensions: financial, customers, internal business processes, and learning and growth [2]. Noell and Lund (2002) indicated that the dimensions of the BSC and the rules of strategic management are basically the same for any size and type of business, and as valid for farming as for any other small business [30]. Some authors claimed that having only four dimensions is a weakness of the BSC. In this regard, they recommended adding additional dimensions such as human resources, people, natural resources, lifestyle, supply chain, innovation processes, and society to the BSC model, according to the characteristics of agriculture and its specific needs [4,34,38,43,44]. To keep the evaluation index system easy to understand and operate, the set of indicators should be limited in number, generally three to seven indicators in each dimension, with the requirement that the selected indicators should reflect the critical performance variables [45]. This paper resolved to keep four BSC dimensions, with some minor adaptations.
Relevant research in the Chinese context is quite limited. Considered as powerful tools for performance evaluation, Data Envelopment Analysis (DEA) and Principal Components Analysis (PCA) are commonly used in the performance evaluation of Chinese family farms. They concentrate on financial measures indicating the comparative relationship between inputs and outputs in farm operation, but motivation factors that are also important features to be considered for farm performance evaluation are not involved in the models mentioned above. Notably, increasingly diverse approaches have been conducted in Chinese family farm performance evaluation in recent years. For example, Liu (2014) selected economic and social development indicators for the performance evaluation of family farms in Songjiang, Shanghai [46]. Li (2015) comprehensively evaluated eight planting family farms in Hunan province using business performance dimensions and development potential dimensions [47]. Jiang (2017) established a performance evaluation index system with economic, technological, social, and ecological performance as the criteria layers [27]. Gao (2017) selected 28 indicators for the growth performance evaluation of grain farms in Shandong, Henan, Hebei, Anhui, and Jiangsu provinces from five dimensions: innovation performance, coordination performance, ecological performance, social performance, and economic performance [28]. Chen (2014) measured the vitality of family farms in Jiangsu province by introducing seven indicators: leadership, strategy, customer and market, measurement analysis and improvement, resources, process management, business results in the Performance Excellence Model [29]. 
Although the application of the BSC to the Chinese family farm performance evaluation is scarce, the above studies show a trend from focusing on outcome and short-term indicators such as scale and net income to emphasizing motivation factors such as long-term strategy and innovation ability, and this trend also reaffirms the need for a methodology of sustainable performance evaluation of Chinese family farms.
By summarizing and analyzing the existing research, we found that very few adaptations of the BSC to farm performance evaluation have been carried out so far, especially in the Chinese context, and that the existing BSC-based farm performance research contains little statistical analysis or comparative study based on multiple-sample surveys. This study contributes to the literature by introducing the BSC model to the sustainable performance evaluation of Chinese family farms. In doing so, a sustainable performance index system based on the BSC for Chinese family farms was produced, and its suitability was justified empirically on the example of Jilin, China. Since the current research of the BSC in agriculture was mainly in case studies, we conducted a multiple-sample survey and statistical procedures for analyzing farm performance, and the approach used in this study allows comparative analysis between different BSC dimensions, industrial types, and regions. After the limitations of family farm sustainable performance evaluation methodologies in previous research were identified, the major academic contributions of this study can be summarized as follows: First, findings suggest that the BSC can be used as an evaluation tool in farm business, although most BSC literature in agribusiness focuses on the corporate sector [39], and prove that the BSC is applicable to the farm sustainable performance evaluation in the Chinese context. Moreover, successful adoption of the BSC will be limited by the development history, phase, scale, and level of family farms, and influenced by cultural and institutional factors. The key is selecting suitable indicators for the evaluation index system while considering the particularities of market, resources, management, and personnel.
Second, this empirical study contributes to intuitively understanding the overall sustainability, balanced performance, strengths, and weak links of Chinese family farms by calculating and comparing the evaluation score of each indicator. The findings reveal some urgent demands for improving the sustainability of the surveyed family farms.
Finally, by engaging in industrial comparison, the findings contribute to identifying the performance characteristics of industrial types, or enterprise combinations, on farms. Results suggest that a combination of planting and breeding can be a future trend for Chinese family farms. The obvious imbalance in the internal business process dimension is a major constraint for grain farms adapting to the new situation. Findings also suggest that farm performance depends more on management than on external environment by revealing the weak connection between the sustainable performance evaluation results and regional economic (or natural) conditions.

Data Source
This study used a quantitative approach to achieve the research aim. A structured questionnaire survey was used as the primary data collection method and was carried out from January to February 2019 (see Supplementary Material for the questionnaire). A total of 187 questionnaires were issued and 156 valid questionnaires were selected. As shown in Table 1, the survey was conducted in three regions of Jilin province: central (51.28%), eastern (23.08%), and western (25.64%). The central area plays an increasingly important role in the economic development of Jilin province, while eastern and western areas lag behind. Therefore, three regions were selected to better reflect the overall level and the regional differences of family farms in Jilin province. The segmentation of industrial types can help us understand farm performance more clearly, so this paper used family farms from four industries: grain (39.74%), horticultural crops (20.51%), breeding (16.67%), and the combination of planting and breeding (23.08%). The Association of Rural Professional Cooperatives in Jilin province helped us select family farms by regions, scales, and industry types, and sent out the electronic questionnaire to selected family farmers to fill out. The questionnaire consisted of three parts: farmer demographics, farm characteristics, and sustainable performance evaluation. The sustainable performance evaluation based on the BSC constitutes the core part of the whole questionnaire (specific questions shown in Table 2).

Participants
(1) Surveyed Farms Had Moderate Scale and Normal Farm Businesses. Among the 156 surveyed farms, farms with annual sales revenue of less than RMB 500,000 accounted for the highest proportion (42.31%), followed by RMB 500,000 to 1 million (29.49%), RMB 1 million to 5 million (21.79%), and RMB 5 million and above (6.41%). The surveyed family farms generally had good profitability. Only 12 farms suffered losses, while 32 farms had profits within RMB 100,000, 46 farms had profits between RMB 100,000 and RMB 300,000, 42 farms had profits between RMB 300,000 and RMB 500,000, and 24 farms had profits above RMB 500,000. About 72% of the surveyed farms had annual sales of less than RMB 1 million, and 76.92% of the farms had profits ranging from RMB 0 to 500,000. Therefore, the investigated farms in this paper had moderate scale and normal farm businesses, which could be used as samples of Chinese emerging family farms for the sustainable performance evaluation [48].
(2) Surveyed Farmers Had the Characteristics of Professionalization and Specialization in Agriculture.

Model Specification
The Fuzzy Comprehensive Evaluation Method was proposed by L. A. Zadeh in 1965 [49]. It is a comprehensive evaluation method based on fuzzy mathematics, which can better solve fuzzy, difficult-to-quantify, non-deterministic problems and make an overall evaluation of things or objects that are restricted by multiple factors. The Fuzzy Comprehensive Evaluation Method has been widely used in various fields of natural science and social science in recent years. It can be an effective method to evaluate the sustainable performance of family farms, because this performance involves multiple quantitative and qualitative measurement indicators, each with a different subordination and level relative to the overall performance. The fuzzy evaluation model aims at providing a fuzzy mapping between each evaluation indicator and the set of categorical appraisal grades. The idea is to define fuzzy sets for the evaluation indicators. Using the fuzzy comprehensive evaluation model involves the following six steps:

• Step 1: Determining the set of evaluating indicators

Select and determine the first-level indicators

• Step 2: Determining the set of appraisal grades

A set of appraisal grades can be seen as a vector $V = (v_1, v_2, \ldots, v_n)$, in which $n$ represents the number of levels in the appraisal set. In this paper the number is 5, and the set of appraisal grades is V = {very poor, poor, moderate, good, excellent}.

• Step 3: Establishing the fuzzy mapping matrix

After the set of appraisal grades $V$ is determined, the next step is to determine the membership degree of each evaluation indicator $c_i$ to the appraisal vector $V$; then the fuzzy mapping matrix is obtained:

$$R = \begin{pmatrix} r_{11} & r_{12} & \cdots & r_{1n} \\ r_{21} & r_{22} & \cdots & r_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ r_{m1} & r_{m2} & \cdots & r_{mn} \end{pmatrix}$$

where $r_{ij}$ means the membership degree of an evaluation indicator $c_i$ to the fuzzy subset $v_j$ in the evaluation grades set $V$, $m$ represents the number of evaluation indicators, and $n$ represents the number of levels in the appraisal. The goal of the evaluation process is to provide a mapping from $U$ (mentioned in Step 5) to $V$. The performance of each evaluation indicator $c_i$ is reflected by the fuzzy vector $r_i$, which is the single-factor evaluation matrix and can be regarded as a fuzzy mapping between indicators set $U$ and evaluation set $V$.

• Step 4: Determining the weight of each evaluation indicator

For $m$ evaluation indicators, the weight can be shown by the vector $W = (w_{c1}, w_{c2}, \ldots, w_{cm})$, where $w_{ci}$ represents the weight of each second-level indicator, $w_{ci} > 0$, and $\sum_{i=1}^{m} w_{ci} = 1$. Weights have a great impact on the final evaluation results. In this paper, the Analytic Hierarchy Process (AHP) was used to determine the weight of each indicator as described in the next subsection.

• Step 5: Getting the fuzzy comprehensive evaluation result vector

The fuzzy weights vector $W$ and the fuzzy mapping matrix $R$ are combined to get the fuzzy evaluation result vector $U$ for each indicator (using multiplication). The fuzzy evaluation model is:

$$U = W \times R = (u_1, u_2, \ldots, u_n)$$

The most commonly used method to quantify evaluation results in practice is the Maximum Membership Principle, that is, the evaluation grade corresponding to the maximum membership of each indicator is the evaluation result of this indicator. In the fuzzy comprehensive evaluation results vector, if $u_r = \max_{1 \le j \le n} u_j$, then the evaluated object is usually considered to belong to the $r$-th grade.

• Step 6: Determining system scores

The Maximum Membership Principle cannot utilize all the information of the fuzzy grades vector, which may lead to a large deviation. After the comprehensive evaluation result vector is determined, the system score can be calculated for comparison using the formula $N = U \times S^{T}$, where $N$ is the total score of the system and $S$ is the grade score of the corresponding factor in the appraisal grades set $V$. In this paper, the quantitative set for the appraisal comment set $V$ is $S = (1, 2, 3, 4, 5)$.
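The six steps above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the three indicators, the respondent counts, and the weights are hypothetical values chosen for illustration, the membership degrees are assumed to be the share of respondents choosing each grade, and the combination of W and R is assumed to be the ordinary weighted-average (multiply-and-sum) operator.

```python
# Hypothetical illustration of Steps 2-6; every number below is made up.

GRADES = ["very poor", "poor", "moderate", "good", "excellent"]  # appraisal set V
SCORES = [1, 2, 3, 4, 5]                                         # grade scores S


def membership_row(counts):
    """Assumed rule: membership degree = share of respondents per grade."""
    total = sum(counts)
    return [c / total for c in counts]


def fuzzy_evaluate(weights, matrix):
    """Compute U = W x R with the weighted-average operator (an assumption)."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    n = len(matrix[0])
    return [sum(w * row[j] for w, row in zip(weights, matrix)) for j in range(n)]


def max_membership_grade(result):
    """Maximum Membership Principle: the grade with the largest u_j."""
    return GRADES[result.index(max(result))]


def system_score(result, scores=SCORES):
    """System score N = U x S^T."""
    return sum(u * s for u, s in zip(result, scores))


# Three hypothetical indicators, each rated by ten hypothetical respondents.
R = [
    membership_row([0, 1, 4, 4, 1]),
    membership_row([1, 2, 5, 2, 0]),
    membership_row([0, 0, 2, 5, 3]),
]
W = [0.5, 0.3, 0.2]  # hypothetical weights; the paper derives its weights with AHP

U = fuzzy_evaluate(W, R)
print([round(u, 2) for u in U])    # result vector U
print(max_membership_grade(U))     # grade under the Maximum Membership Principle
print(round(system_score(U), 2))   # system score N
```

In this toy case the Maximum Membership Principle returns "moderate", whereas the system score of about 3.41 also reflects the substantial membership in "good", which is the kind of information loss that Step 6 is meant to avoid.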

Indicator Description
Balanced Scorecard (BSC) indicators in the farm business were transplanted in general from the framework of traditional industrial enterprises. The industrial enterprise management system is more established, and the relevant data can be obtained easily; however, the availability of data must be considered before selecting indicators for farms. Developing a technically unified and standardized BSC evaluation system for the farm business is not an easy task. The BSC model has been applied and modified to the farm business through a case study and a Delphi study [13,43]. However, these indicators set for western farms are not suitable for the performance evaluation of Chinese farm business, and the data that can fully reflect the farm performance are almost impossible to obtain. The selection of indicators in this study was mainly based on a literature review of Chinese family farm performance evaluation and on expert advice. This paper kept the four BSC dimensions as first-level indicators and selected 19 representative and operable second-level indicators, while considering the market, resources, management, and personnel characteristics of Chinese family farms. The sustainable performance evaluation index system for Chinese emerging family farms was set up as shown in Table 2.

Financial Performance
Financial performance can be measured by member's earnings and farm's benefits [50,51]. Member's earnings are mainly reflected by the per capita net income of family members [28,50,52], which has a great influence on the working enthusiasm and stability of family members. Measuring the economic performance of family farms should combine benefits and debt risk [28]. The farm's benefits can be measured by the average sales growth rate, average profit growth rate, and average liability-asset ratio over the most recent three years [28,47,52]. The first two reflect the family farm's profitability. The liability-asset ratio is an important indicator of capital structure reflecting the farm's solvency and financial risk [28], affecting and determining the farm's sustainable performance.
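As a rough illustration of how the farm's benefit indicators could be computed from farm accounts, the following Python sketch may help. The paper does not give explicit formulas, so the definitions here are assumptions: the average growth rate is taken as the arithmetic mean of year-over-year growth rates, and the liability-asset ratio as total liabilities divided by total assets. The function names and all figures are hypothetical.

```python
def average_growth_rate(values):
    """Arithmetic mean of year-over-year growth rates (assumed definition).

    `values` is a chronological list of yearly sales or profits.
    """
    if len(values) < 2:
        raise ValueError("need at least two yearly values")
    rates = [(cur - prev) / prev for prev, cur in zip(values, values[1:])]
    return sum(rates) / len(rates)


def liability_asset_ratio(total_liabilities, total_assets):
    """Total liabilities divided by total assets (assumed definition)."""
    if total_assets <= 0:
        raise ValueError("total assets must be positive")
    return total_liabilities / total_assets


# Hypothetical farm: yearly sales in RMB, oldest year first.
sales = [400_000, 440_000, 506_000]
print(f"average sales growth: {average_growth_rate(sales):.1%}")
print(f"liability-asset ratio: {liability_asset_ratio(150_000, 600_000):.1%}")
```

The same `average_growth_rate` helper can be applied to a yearly profit series, provided the base-year values are positive; loss-making base years would need separate handling.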

Market Performance
Market performance reflects farm performance in maintaining and expanding business. It can be measured by customer relationships and market status [50,51]. Customer relationships include relationships with agricultural brokers, relationships with other agricultural business entities, and customer satisfaction [28]. Though behaving entrepreneurially, Chinese farmers are more used to production and are not good at sales. Most family farms rely mostly on the local wholesale market and the purchase and trafficking of rural brokers for sales; some products are handed over to agricultural companies, but are usually unable to sell at high prices due to the lack of pricing power. Some products are transferred to agricultural cooperatives, but the transaction volume is usually small and cannot meet the needs of selling large amounts of agricultural products. As such, middlemen are the main marketing channels for most family farms in China, and it is difficult to collect customer data and conduct customer research. The relevant questionnaires were filled out by the surveyed farmers themselves. Market status can be measured by bargaining power in the supply chain and branding level [50,51], which reflect the farm's market competitiveness and influences.

Internal Business Process Performance
Internal business process performance can be measured by ecological performance [28] and management level [50]. Ecological performance is a crucial factor for sustainable agricultural development, which can be evaluated by the ratio of organic agricultural products, frequency of waste recycling, and pollution-free treatments [53]. The management level is an important metric to evaluate modern farms, which can be reflected by number of farm regulations, length of the land contract, and number of registered trademarks [50]. Farm regulations may include regulations for standardized production, financial management, job responsibility, employee management, etc. [50]. The length of the land contract affects the farm's long-term investment and daily management. A registered trademark requires farmers to pay more attention to product quality and customer service, reflecting the farm's long-term development plan.

Learning and Growth Performance
Learning and growth performance determines the sustainable development potential of the farm, which includes learning ability and innovation ability. Learning ability can be measured by the farmer's education and farmer's age [51]. The influence of the farmer's education on farm efficiency and outcomes has been proven in the literature, since higher education leads to a stronger ability to adopt best management practice and new technologies [54][55][56]. In theory, at higher ages, low legacy beliefs impede transformational and transactional leadership behaviors and boost passive avoidant leadership behaviors [57], thus age is expected to have a negative correlation with the farmer's learning ability. Agricultural innovation means the emergence of new products, new processes, and new organizational forms in agriculture [58]. Therefore, the innovation of family farms involves product and technology innovation, and organizational system innovation [28]. Product and technology innovation is the key to the success of family farm operations, which is reflected mainly by the adoption of new technologies and new varieties [59,60]. On the other hand, family farms adapt to the changing external environment through organizational system innovation, and innovative financing channels and marketing channels are essential components of organizational system innovation.

Method for Determining Indicator Weight
Indicator weight, an important part of the performance evaluation, refers to the importance coefficient of each indicator in the whole indicator system. It reflects the contribution of each indicator to the overall performance. In this paper, the Analytic Hierarchy Process (AHP) was used to determine indicator weights. The AHP is a theory of measurement that relies on the judgments of experts to derive priority scales through pairwise comparisons [61]. The analyzed problem is divided into three levels: target layer, criteria layer, and solution layer. The importance of each decision-making scheme, that is, its weight relative to the target layer, is then determined by means of pairwise comparison. The main steps are as follows:

• Step 1: Structuring a hierarchy of the criteria based on the evaluated indicators
Determine the overall objective for the problem and list the factors that affect the objective. In this research, the sustainable performance of Chinese family farms is the target level A, which is followed by the first-level indicators layer B and the second-level indicators layer C (see Table 2).

• Step 2: Constructing a pairwise comparison matrix
A comparison matrix is the basis for calculating the weights of the indicators. The expert will be asked to rate the relative importance of each factor. If there are n evaluation factors, the importance intensity of factor i over factor j can be represented by $a_{ij}$. A pairwise comparison matrix A is as follows:

$$A = (a_{ij})_{n \times n} = \begin{pmatrix} 1 & a_{12} & \cdots & a_{1n} \\ 1/a_{12} & 1 & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ 1/a_{1n} & 1/a_{2n} & \cdots & 1 \end{pmatrix}$$

The comparative importance of each indicator in the matrix is from 1 to 9, and the reciprocal corresponds to the relative unimportance degree (see Table 3).

Table 3. Explanation of the two factors comparison scale in AHP.

| Scale | Meaning |
| --- | --- |
| 1 | Comparing the two elements, element i is as important as element j |
| 3 | Comparing the two elements, element i is slightly more important than element j |
| 5 | Comparing the two elements, element i is obviously more important than element j |
| 7 | Comparing the two elements, element i is strongly more important than element j |
| 9 | Comparing the two elements, element i is absolutely more important than element j |
| 2, 4, 6, 8 | Intermediate value between two adjacent judgments above |
| Reciprocal | When the importance of factor i relative to factor j is $a_{ij}$, the importance of factor j relative to factor i is $a_{ji} = 1/a_{ij}$ |
This study used the Experts Grading Method to help determine the relative importance of indicators. Eight experts were involved in this Delphi study: one deputy director of the Agricultural Committee of Jilin province, one secretary of the Farmers' Cooperative Federation of Jilin province, three family farm owners, and three professors of agribusiness at Jilin Agricultural University. The experts were asked to rate the relative importance of each indicator, and the average scores from the eight experts were then used to construct the comparative judgment matrices. After the third round of questionnaire collection, the judgment matrices had satisfactory consistency and the final result was obtained. The comparative judgment matrices of the four first-level indicators to the target layer A and of the 19 second-level indicators to their corresponding first-level indicators are presented in Table 4.
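As a minimal sketch of how averaged expert scores could be turned into a reciprocal judgment matrix, the Python snippet below assumes that each expert supplies one score per pair (i, j) with i < j and that the scores are combined by the arithmetic mean. The function name `build_comparison_matrix` and the sample scores are hypothetical and are not taken from the survey.

```python
from statistics import mean


def build_comparison_matrix(n, expert_scores):
    """Build an n x n reciprocal pairwise comparison matrix.

    expert_scores maps an index pair (i, j), with i < j, to the list of
    scores the experts gave for "factor i compared with factor j".
    """
    matrix = [[1.0] * n for _ in range(n)]  # diagonal stays at 1
    for (i, j), scores in expert_scores.items():
        a_ij = mean(scores)        # assumed aggregation: arithmetic mean
        matrix[i][j] = a_ij
        matrix[j][i] = 1.0 / a_ij  # reciprocal property a_ji = 1 / a_ij
    return matrix


# Hypothetical scores from three experts for three factors.
scores = {
    (0, 1): [2, 3, 1],
    (0, 2): [4, 5, 3],
    (1, 2): [2, 2, 2],
}
A = build_comparison_matrix(3, scores)
```

Only the upper triangle needs to be elicited; the lower triangle follows from the reciprocal rule in Table 3.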
The weight vectors can be obtained from the above matrices by normalizing the vectors in each column and averaging over the rows of the resulting matrices.
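The column-normalization and row-averaging rule described above can be sketched in a few lines of Python. This is an illustrative approximation of the principal eigenvector, not the authors' MATLAB code; the function name `ahp_weights` and the example matrix are hypothetical.

```python
def ahp_weights(matrix):
    """Approximate AHP weights: normalize each column, then average each row."""
    n = len(matrix)
    column_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    normalized = [
        [matrix[i][j] / column_sums[j] for j in range(n)] for i in range(n)
    ]
    return [sum(row) / n for row in normalized]


# Hypothetical, perfectly consistent 3 x 3 judgment matrix.
A = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(A)  # approximately [4/7, 2/7, 1/7]
```

For a perfectly consistent matrix this approximation coincides with the exact eigenvector weights; for mildly inconsistent matrices it is usually close.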

• Step 4: Checking the consistency of the judgment
The consistency test aims at reducing the impact of subjective factors. The test coefficient is $CR = CI/RI$, where $CI = (\lambda_{\max} - n)/(n - 1)$ and RI is the average random consistency indicator of the judgment matrix, which can be found in the relevant numerical table. When $CR \le 0.1$, the judgment matrix has satisfactory consistency; otherwise, the judgment matrix needs to be re-scored.
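The consistency test can be illustrated with the following Python sketch. It estimates $\lambda_{\max}$ from the column-normalized weights rather than from an exact eigenvalue routine, and it assumes the commonly cited random index values attributed to Saaty, since the paper's own RI table is not reproduced here. The function name and the example matrix are hypothetical.

```python
# Commonly cited average random consistency index (RI) values for
# matrix orders 1 to 9 (an assumption; the paper's table is not shown).
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def consistency_ratio(matrix):
    """Return (lambda_max, CI, CR) for a pairwise comparison matrix."""
    n = len(matrix)
    if n <= 2:
        # Matrices of order 1 or 2 are always consistent.
        return float(n), 0.0, 0.0
    col_sums = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    weights = [
        sum(matrix[i][j] / col_sums[j] for j in range(n)) / n
        for i in range(n)
    ]
    # Estimate lambda_max as the mean of (A w)_i / w_i.
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    cr = ci / RANDOM_INDEX[n]
    return lambda_max, ci, cr


# Hypothetical, slightly inconsistent judgment matrix.
A = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 3.0],
    [1 / 5, 1 / 3, 1.0],
]
lambda_max, ci, cr = consistency_ratio(A)
is_consistent = cr <= 0.1
```

A matrix that fails the threshold would be returned to the experts for re-scoring, as described in Step 4.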

Weights and Consistency Test of the First-Level Indicators to the Target Layer
We used MATLAB (R2020a; MathWorks, Inc., Natick, Massachusetts, USA) to calculate the maximum eigenvalue $\lambda_{\max}$ of this comparative judgment matrix (see Table 5).
Table 5. Weights and consistency test of the first-level indicators to the target layer.

Weights and Consistency Test of Second-Level Indicators to First-Level Indicators
The weights $W_{c_i}$, the maximum eigenvalues $\lambda^{(2)}_{\max}(b_i)$, and the consistency ratios $CR^{(2)}_{b_i}$ are shown in Table 6.
Table 6. Weights and consistency test of the second-level indicators to the first-level indicators.

Model Analysis
Among the four first-level indicators, financial performance and market performance are the most important dimensions for the sustainable performance evaluation of Chinese family farms ($W_{b_1} = 0.300$, $W_{b_2} = 0.353$), followed by internal business process performance ($W_{b_3} = 0.188$) and learning and growth performance ($W_{b_4} = 0.159$). The consistency ratio of the first-level indicators layer to the target layer is $CR^{(1)} = 0.0614$; the consistency ratios of the four groups of second-level indicators to their first-level indicators are given in Table 6 (for example, $CR^{(2)}_{b_4} = 0.0018$); and the consistency ratio of the second-level indicators layer to the target layer is $CR^{(3)} = 0.03$. All of these consistency ratios are less than 0.1, so all judgment matrices in the model passed the consistency test. The comparative judgment matrices were reasonably constructed, and the overall ranking of the hierarchy passed the consistency test. The model is therefore reliable and can be used for analysis.

Fuzzy Evaluation Scores of Indicators
The 19 second-level indicators were evaluated by the investigated family farmers (see Table 2 for the specific items). The number and proportion of each choice for each second-level indicator are listed in Table 7. The fuzzy evaluation matrices of the four first-level indicators were obtained by normalizing the survey data in Table 7, and the weights of the 19 second-level indicators relative to their corresponding first-level indicators were taken from the AHP results in Table 6.
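To make the normalization step concrete, the Python sketch below converts response counts into membership degrees and combines them with indicator weights. The weighted-average operator is an assumption, since the fuzzy operator used is not stated here, and the counts and weights are hypothetical rather than taken from Table 6 or Table 7.

```python
def membership_rows(counts_per_indicator):
    """Convert response counts per grade into membership degrees (each row sums to 1)."""
    return [
        [count / sum(counts) for count in counts]
        for counts in counts_per_indicator
    ]


def fuzzy_composite(weights, matrix):
    """Weighted-average composition: B_k = sum_i w_i * r_ik (assumed operator)."""
    grades = len(matrix[0])
    return [
        sum(w * row[k] for w, row in zip(weights, matrix))
        for k in range(grades)
    ]


# Hypothetical counts of 156 responses over five grades for two indicators.
counts = [
    [10, 20, 50, 60, 16],
    [30, 40, 40, 30, 16],
]
weights = [0.6, 0.4]  # hypothetical second-level indicator weights
R = membership_rows(counts)
B = fuzzy_composite(weights, R)
```

Because every row of `R` sums to 1 and the weights sum to 1, the resulting vector `B` also sums to 1 and can be read as the membership of the first-level indicator in each grade.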

Scores of First-Level Indicators
According to the Maximum Membership Principle, the maximum membership degree of financial performance is "good" ($\max(U_{b_1}) = 0.357$). In this paper, the quantitative set for the appraisal comment set V is $S = (1, 2, 3, 4, 5)$, which gives the financial performance score $N_{b_1} = U_{b_1} \times S^{T} = 3.421$. The financial performance of the investigated family farms is "above moderate." The same method was used for the other three BSC dimensions.
The maximum membership degree of market performance is "good" ($\max(U_{b_2}) = 0.286$). The market performance score is $N_{b_2} = U_{b_2} \times S^{T} = (0.128, 0.066, 0.250, 0.286, 0.270) \cdot (1, 2, 3, 4, 5)^{T} = 3.504$. The market performance of the investigated family farms is "above moderate."
The maximum membership degree of internal business process performance is "very poor" ($\max(U_{b_3}) = 0.261$). The internal business process performance score is $N_{b_3} = U_{b_3} \times S^{T} = (0.261, 0.162, 0.177, 0.150, 0.251) \cdot (1, 2, 3, 4, 5)^{T} = 2.971$. The internal business process performance of the investigated family farms is at the "moderate" level.
The maximum membership degree of learning and growth performance is "very poor" ($\max(U_{b_4}) = 0.269$). The learning and growth performance score is $N_{b_4} = U_{b_4} \times S^{T} = 2.783$. The learning and growth performance of the investigated family farms is "below moderate."
On the whole, the overall score of the four BSC dimensions, weighted by the first-level indicator weights, is 3.264, indicating that the sustainable performance of the surveyed family farms in Jilin province is at the "slightly above moderate" level.
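The scoring step can be checked with a short Python sketch that uses the market and internal business process membership vectors quoted above. The overall figure of 3.264 appears to match the sum of the four dimension scores weighted by the first-level AHP weights reported in the Model Analysis section; treating it as that weighted sum is an inference from the numbers, and the function name `fuzzy_score` is hypothetical.

```python
GRADE_VALUES = [1, 2, 3, 4, 5]  # quantitative set S for the comment set V


def fuzzy_score(membership, grade_values=GRADE_VALUES):
    """Score = membership vector multiplied by the grade values."""
    return sum(m * s for m, s in zip(membership, grade_values))


# Membership vectors quoted in the text.
market = [0.128, 0.066, 0.250, 0.286, 0.270]
internal = [0.261, 0.162, 0.177, 0.150, 0.251]

market_score = fuzzy_score(market)      # rounds to 3.504
internal_score = fuzzy_score(internal)  # rounds to 2.971

# First-level weights and dimension scores reported in the paper.
weights = [0.300, 0.353, 0.188, 0.159]
dimension_scores = [3.421, 3.504, 2.971, 2.783]
overall = sum(w * s for w, s in zip(weights, dimension_scores))  # rounds to 3.264
```

The grade with the largest membership degree can be found with `max(range(5), key=lambda k: market[k])`, which corresponds to the Maximum Membership Principle.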
As shown in Table 8, the overall sustainable performance of the investigated family farms in Jilin province is in the slightly above moderate level (3.264), in which market performance (3.504) and financial performance (3.421) are above moderate, while internal business process performance (2.971) is moderate and learning and growth performance (2.783) is below moderate. The surveyed farms performed better in outcome indicators than in driving indicators. As Chinese family farms are still in their infancy, financial and market performance will be given more attention in order to survive. However, there is still a long way to go in the institutionalization and standardization of farm management, especially in the innovation capabilities. It is notable that the scores of the four first-level indicators are quite close, and the difference between the highest and lowest score is around 0.7. Although there is great potential for improvement in learning, innovation and internal process management, the overall BSC performance of the investigated family farms is relatively balanced.

Scores of Second-Level Indicators
After calculating the scores of the four first-level indicators, we used the above formula to calculate the scores of the 19 second-level indicators. As shown in Table 9, the four second-level indicators in the financial dimension scored between 3 and 4 points. The ranking of the four indicators is: the average liability-asset ratio in recent three years $c_4$; the per capita net income of family members $c_1$; the average profit growth rate in recent three years $c_3$; the average sales growth rate in recent three years $c_2$. The surveyed family farms had relatively good solvency. This is in line with data provided by the China Economy Trends Research Institute in 2016, according to which the overall liability-asset ratio of Chinese family farms was 10.6% [62]. The family farms played a certain role in increasing farmers' incomes; however, the sales growth rates and profit growth rates of the surveyed farms in 2016-2018 were unsatisfactory. The current price formation mechanism reform of important agricultural products led to a continuous decline in food prices, which severely reduced profit margins, especially those of grain farms [63]. In addition, the rising prices of agricultural production inputs resulted in a substantial increase in agricultural production costs. In 2015, the total value of family farm production inputs was RMB 58.982 billion, and the average annual input for each family farm was RMB 172,000. Farms cannot afford such a high cost without policy support and subsidies [25].
| Indicator | Code | Score |
| --- | --- | --- |
| Average sales growth rate in recent three years | $c_2$ | 3.167 |
| Average profit growth rate in recent three years | $c_3$ | 3.308 |
| Average liability-asset ratio in recent three years | $c_4$ | 3.628 |
| Market performance | $b_2$ | 3.504 |
| Relationship with agricultural broker | $c_5$ | 4.026 |
| Relationships with other agricultural business entities | $c_6$ | 3.923 |
| Customer satisfaction | $c_7$ | 4.128 |
| Bargaining power in supply chain | $c_8$ | 3.795 |
| Branding level | $c_9$ | 1.936 |
| Internal business process performance | $b_3$ | 2.971 |
| Pollution-free, green and organic agricultural products ratio | $c_{10}$ | 2.769 |
| Ratio of wastes recycling and pollution-free treatment | $c_{11}$ | 3.218 |
| Number of farm regulations | $c_{12}$ | 2.962 |
| Length of land contract | $c_{13}$ | 3.782 |
| Number of registered trademarks | $c_{14}$ | 2.026 |
| Learning and growth performance | $b_4$ | 2.783 |
| Farmer's education | $c_{15}$ | 2.962 |
| Farmer's age | $c_{16}$ | 3.590 |
| Adopting new technology or new variety | $c_{17}$ | 4.308 |
| Diversification of marketing channels | $c_{18}$ | 1.897 |
| Diversification of financing channels | $c_{19}$ | 1.923 |

The second-level indicators in the market dimension scored slightly higher than those in the financial dimension. The ranking of the five indicators is: Customer satisfaction on products $c_7$; Relationship with agricultural broker $c_5$; Relationships with other agricultural business entities $c_6$; Bargaining power in supply chain $c_8$; Branding level $c_9$. In China, farmers rely very much on rural market intermediaries, such as agricultural product brokers, cooperatives, and agricultural enterprises, so their business relations are relatively stable, which may lead to higher customer satisfaction. However, as the producers in the downstream of the supply chain, the surveyed family farms did not consider their bargaining power to be strong in transactions with rural market intermediaries. Additionally, the branding level of the surveyed farms was quite low. It should be noted that the slightly higher scores of indicators $c_5$, $c_7$, and $c_6$ reflect the perceptions of the farmers themselves.
Internal business process performance was slightly below moderate. The ranking of the five second-level indicators is: Length of land contract $c_{13}$; Ratio of wastes recycling and pollution-free treatment $c_{11}$; Number of farm regulations $c_{12}$; Pollution-free, green and organic agricultural products ratio $c_{10}$; Number of registered trademarks $c_{14}$. The result shows that the investigated farms had relatively long-term land tenure and had begun to pay attention to farm management and sustainable development. The low scores of $c_{10}$ and $c_{14}$ can be seen as a response to the market reality in China. With the development of China's economy, however, Chinese consumers' requirements for food safety and quality have increased accordingly. People are more inclined to consume products with well-known brands and quality certifications. At present, the proportion of family farms in China that own registered trademarks is very low. Among the 342,000 family farms, only 11,000 family farms have registered trademarks, accounting for 3.34%. Most Chinese family farms do not have their own brands. The number of family farms that have agricultural product quality certification is even more limited, only 5273, accounting for 1.54% [25]. The organic production and branding of agricultural products are becoming the realistic needs of modern agricultural development in China.
Learning and growth performance ranked lowest, and the ranking of indicators from high to low is: Adopting new technology or new variety $c_{17}$; Farmer's age $c_{16}$; Farmer's education $c_{15}$; Diversification of financing channels $c_{19}$; Diversification of marketing channels $c_{18}$. The scores of $c_{15}$ and $c_{16}$ reflect the moderate learning ability of the surveyed farmers. Farmers performed better in adopting new technologies and varieties, and they have benefited from the Chinese government's continuous efforts in agricultural technology extension and professional farmer training in recent years. Farmers lacked the ability to market and finance from multiple sources. This suggests that the surveyed farmers' innovation capability is much stronger in introducing new products and technology than in adopting new processes and new organizational forms.

Industrial Differences
Among the four farm types, farms combining planting and breeding ranked first (3.409), followed by horticultural crops (3.336) and grain (3.250), with animal products scoring lowest (3.006).
The one-way ANOVA result in Table 10 shows that the differences in terms of internal business process performance and overall performance are significant (p-values are less than 0.05). Specifically, compared to the other three farm types, farms combining planting and breeding have a better BSC performance, especially in the internal business process dimension. The combination of planting and breeding is an ecological agricultural mode that uses manure and organic matter produced by breeding as organic fertilizer for planting, and the crops produced by planting as feed for breeding. The combination of planting and breeding has obvious advantages in reducing agricultural wastes and environmental pollution and in promoting organic production, learning, and innovation, thus improving customer satisfaction and the farm's financial performance correspondingly.
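For readers unfamiliar with the test, the following Python sketch computes the one-way ANOVA F statistic from first principles. The group scores are hypothetical and are not the survey data; obtaining the p-value additionally requires the F distribution, which a statistics package would normally supply.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical dimension scores for three farm types.
groups = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
]
f_stat, df_between, df_within = one_way_anova_f(groups)
```

A large F relative to the critical value for (df_between, df_within) degrees of freedom indicates that at least one group mean differs from the others.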
Horticulture farms perform much better in the market dimension due to the technical and market characteristics. Fruits and vegetables are quite seasonal and perishable, not suitable for long-term storage, and they need to find a market as soon as possible. In addition, because of the frequent market transactions within a year, fruits and vegetables rely greatly on the rural intermediary organizations and more attention is needed to maintain customer relations and brand image.
Breeding farms lag behind in all BSC dimensions. The evaluation score is basically consistent with the reality of the breeding industry in Jilin province, which has the problems of extensive operation, poor management, and serious pollution. For sustainable development, transforming from single breeding mode to a combination of planting and breeding can be a future trend.
In regard to grain farms, it is notable that the internal business process performance shows an obvious imbalance in the four BSC dimensions. Jilin province is a major corn planting area in China. The rapid growth of China's corn processing industry in the past 20 years has stimulated corn production and consumption, increased corn prices and grain farmers' incomes. As a result, a single planting structure was formed with limited ecological emphasis and, possibly, less innovative management of the grain industry in Jilin province.

Regional Difference
There is no significant difference in the overall sustainable performance between family farms in eastern, central, and western Jilin with average scores of 3.370, 3.216, and 3.263, respectively. The overall ranking of the three regions is: (1) eastern, (2) western, (3) central.
The calculation results of the one-way ANOVA in Table 11 show that there are significant differences in terms of financial performance and learning and growth performance (p-values are less than 0.05). Specifically, the performance of the four BSC dimensions of the surveyed farms in eastern Jilin is relatively balanced and the overall performance level is slightly higher, especially in the learning and growth dimension, reflecting better sustainable development capability.
The BSC performance of the surveyed family farms in the western region is unbalanced: farms perform much better in financial indicators than in non-financial indicators and are especially weak in the learning and growth dimension, indicating a lack of long-term goals for sustainable development.
The evaluation result of the central area is quite consistent with that of Jilin province. Despite the relatively better economic and natural conditions in the central area, there seems to be a weak connection between sustainable performance and external environment, indicating that farm performance depends more on management than on external environment.

Discussion
The produced family farm sustainable performance index system based on the BSC and the justification of its suitability for the Chinese family farm sustainable performance evaluation included the following: (1) Indicators most commonly used in the literature to assess farm performance in view of the particularity of Chinese family farms were determined and used to develop the Chinese family farm sustainable performance index system. (2) The indicators of Chinese family farm sustainable performance were evaluated, which helped to determine the types of farm sustainable performance (low, moderate or strong) and the farm's potential of growth and development. (3) The evaluation of family farm sustainable performance consisted of three logical constructs: each indicator weight in the overall system was determined using AHP; the score of each first-level and second-level indicator was calculated and ranked using the Fuzzy Comprehensive Evaluation model; differences between four BSC dimensions, four industrial types, and three regions were compared and analyzed.
Several findings were obtained by analyzing the evaluation results of the surveyed farms sustainable performance: (a) Based on the evaluation results of our selected family farms, we concluded that the overall sustainable performance of the surveyed family farms in Jilin province is in the slightly above moderate level (3.264). Surveyed farms performed better in outcome indicators (financial dimension, market dimension) than in driving indicators (internal business process dimension, learning and growth dimension). The evaluation results in this paper are basically in line with the existing literature about farm performance in China mentioned at the beginning of this paper, and consistent with the current situation of Chinese family farms. This empirical study justified that, as a mature performance management tool for industrial enterprises, the BSC can be used in family farm performance evaluation and is also appropriate for the sustainable performance evaluation of emerging family farms in the Chinese context with the selection of suitable indicators reflecting the particularities of market, resources, management, and personnel. (b) The sustainable development of family farms depends on a balance of all BSC dimensions.
The ranking of the first-level indicators by their Fuzzy Comprehensive Evaluation results is: market performance (3.504), financial performance (3.421), internal business process performance (2.971), and learning and growth performance (2.783). Market performance scored higher than the others and ranked first among the four BSC dimensions. This may be because family farms in China rely on rural market intermediary organizations for sales. It should also be noted that four of the five second-level indicators of the market dimension are subjective, and the farmer's self-perception may not be accurate, as it is influenced by cultural, psychological, and institutional factors. The result would be different if we surveyed customers, enterprises, and other agricultural business entities. We also noticed that there are some differences in the scores of the four first-level indicators, but they are not significant, suggesting that the performance of the four BSC dimensions is not seriously unbalanced. Although there is great potential for improvement in learning, innovation and internal process management, the Chinese family farms have already begun to pay attention to farm management and sustainable development. There is an urgent demand for improving farmers' ability to market and finance from multiple sources as well as increasing their awareness of brands and registered trademarks. (d) Industrial differences exist in farms' sustainable performance. Farms combining planting and breeding have better sustainability compared with the other three farm types. Breeding farms show poor performance across all BSC dimensions. For sustainable development, transforming from single breeding mode to a combination of planting and breeding could be a future trend. Notably, grain farms show an obvious imbalance in the internal business process dimension.
Facing the pressure of price formation mechanism reform of important agricultural products in China, grain farms need to enhance internal business management and formulate long-term development strategies to meet the requirements of the new situation. On the basis of identifying the performance characteristics of industrial types or enterprise combinations on farms, precise support strategies and subsidy policies can be formulated. (e) The overall sustainable performance of family farms in eastern, central, and western Jilin province is quite close, and it does not seem meaningful to analyze the regional differences of the surveyed family farms' sustainable performance. However, the weak connection between the sustainable performance evaluation results and regional economic (or natural) conditions indicates that farm performance depends more on management than on external environment. Additionally, the unbalanced performance of farms in western Jilin province requires them to overcome the shortcomings of having purely financial and economic goals of profit maximization and to achieve sustainable profit as a longer-term objective.
Some limitations exist in this work, so its findings must be interpreted with caution. First, the data cannot be generalized. We collected only the farm data of 2018, rather than longitudinal data, due to constraints on research conditions; specifically, indicators of the market, internal business process, and learning and growth dimensions were all selected in the static time section. The sustainable performance of the surveyed farms cannot be reflected dynamically, which is also a deficiency of the Balanced Scorecard itself. In addition, the distribution of the sample family farms in each region is uneven across the four industries due to the difficulty of the investigation; therefore, the empirical results cannot accurately reflect the influence of regional conditions on the development of family farms, but they can still provide some valuable information. However, the methodology can be generalized. The family farm sustainable performance index system based on the BSC may provide a reference framework to study family farms in other places with a similar context, and the methods used in this study, such as the multiple samples survey and statistical procedures, can be used in other contexts for analyzing farm performance. Second, due to the difficulty in obtaining each farm's internal and external information, the data in this paper were from the questionnaires filled out by the farmers themselves; therefore, the conclusion was based on the farmers' self-assessment and its correctness needs to be further verified in future research. In fact, sustainable development not only involves the interests of investors, customers, and employees, but also the interests of government, the public, creditors, and other important stakeholders, which is also a deficiency of the Balanced Scorecard.
Finally, considering the development history, phase, scale, and level of Chinese family farms, the second-level indicators for the four BSC dimensions are relatively fewer and simpler in order to ensure the availability of relevant data. The sustainable performance evaluation index system should be a complex system. The more comprehensive and systematic the indicators we select, the more objective and accurate the evaluation result will be. In essence, the successful adoption of the BSC will be limited by the development history, phase, scale, and level of family farms, and influenced by cultural and institutional factors. The key is selecting suitable indicators for the evaluation index system while considering the particularities of market, resources, management, and personnel. As these family farms progress, more diverse evaluation indicators can be selected for the sustainable performance evaluation.
Future research can address these limitations and extend our findings. Primary among these opportunities is to collect longitudinal data to conduct a trend analysis of family farm sustainable performance and expand the sample size to ensure that the sample family farms of each industry type are evenly distributed in each surveyed region. In particular, we hope this study encourages future research to engage in further exploration of the important influence of scale [64] and management models [65] on the sustainability of family farms, as well as test the evaluation index we created across a greater geographical scope and different industry contexts. For example, studies of grain farm businesses crossing bulk grains, special grains, and miscellaneous grains will challenge and extend the results of this study. Future research can also monitor the sustainable performance of family farms from other stakeholder groups that might negatively or positively affect the sustainability of family farms in terms of financial, market, or learning and growth performance. The Performance Prism system can be an excellent framework in future studies because it monitors all the key performance stakeholders and is not limited to evaluating the performance from the perspective of stakeholder satisfaction. What is more, the stakeholders' contribution to the performance will also be taken into account [66]. Recent studies have suggested that we can also expect an exploration of the Balanced Scorecard in agricultural cooperatives, based on interviews with both cooperative staff and relevant industry stakeholders [35] in which social performance should be included in the index system [51].