Linking Ecosystem Services and the SDGs to Farm-Level Assessment Tools and Models

Abstract: A number of tools and models have been developed to assess farm-level sustainability. However, it is unclear how well they can potentially incorporate ecosystem services (ES), or how they may contribute to attaining the United Nations Sustainable Development Goals (SDGs). Understanding how farm-level assessment tools and models converge on these new paradigms of sustainability is important for comparing the sustainability performance of farming systems, conducting meta-analyses and upscaling local responses to global driving forces. In this study, a coverage analysis was performed for several farm-level sustainability assessment (SA) tools (SAFA, RISE, KSNL, DLG) and models (MODAM, MONICA, APSIM), with regard to their potential for incorporating ES and contributing to the attainment of the SDGs. Lists of agriculturally relevant CICES classes and SDG targets were compiled and matched against the indicators of the tools and models. The results showed that SAFA possessed the most comprehensive coverage of ES and SDGs, followed by RISE and KSNL. In comparison to models, SA tools were observed to have a higher potential for covering ES and SDGs, which was attributed to larger and broader indicator sets. However, this study also suggested that, overall, current tools and models do not sufficiently articulate the concept of ecosystem services.


Introduction
Agriculture provides a diverse range of benefits to human well-being. Besides primarily producing food, fodder, fiber, and fuel, agriculture plays a crucial role, for example, in carbon storage, nutrient cycling, hydrological flow regulation, biodiversity conservation, as well as sustaining rural economies and cultural heritage. In this sense, agricultural systems can be considered multifunctional, as they fulfill several purposes simultaneously [1,2]. However, agricultural management often generates trade-offs between functions, e.g., maximization of biomass production versus conserving biodiversity, resulting in outcomes that are detrimental to long-term environmental and socio-economic sustainability [3,4]. To promote informed decisions and sustainable agricultural management, integrated and systems-based approaches are needed in science, policy and practice [5].
In agricultural research and policy, the ecosystem service (ES) concept is increasingly used as an integrative framework [6,7] to demonstrate the benefit of nature to human well-being [8]. Derived from biophysical processes and functions, ES produce benefits and values that are used by humans [9]. Within the highly managed environment of agricultural systems, agricultural ES are the product of the coupled interaction between agricultural activity and the ecosystems functions in which they are embedded [10]. Agricultural ES include, among others: the provision of biomass, regulation of hydrological cycles, sequestration of carbon, maintenance of pollinators, and cultural services related to tourism and landscape attractiveness.
At the level of international policy and debate, the United Nations Sustainable Development Goals (SDGs) [11] acknowledge the importance of sustainably managing ecosystems (e.g., SDG 6: Clean Water, SDG 13: Climate Action, SDG 14: Life Below Water, SDG 15: Life on Land). Although the SDGs do not specifically mention agricultural ES, sustainable agriculture is seen as a prerequisite to sustainable global development and eliminating hunger (SDG 2: Zero Hunger) [12]. Taken together, sustainably managing agricultural ES is an integral part of achieving the SDGs [13,14]. Similar sentiments are reflected in the Farm to Fork Strategy of the European Green Deal [15] and many other national strategies and policies. However, the challenge remains of how to translate global objectives into local action, while simultaneously considering the site-specific characteristics of local agricultural sustainability [16]. Consequently, improving agricultural sustainability begins with understanding how decisions at the farm level can directly and indirectly impact ecological and socio-economic systems.
A large number of approaches have been developed for assessing agricultural sustainability, which has resulted in a variety of farm-level sustainability assessment (SA) tools and models [16,17]. Farm-level SA tools and models are designed with the intent of guiding agricultural management toward sustainability. Farm-level tools are typically used by practitioners and consultants for comprehensive, ex-post assessment and strategic planning, whereas models are mainly used within research for making anticipatory simulations, usually with a narrower range of sustainability objectives. Differences in purpose and approach between SA tools and models substantially influence how they individually articulate the sustainability performance of farm management [18,19]. Specifically, differences in thematic scope, i.e., the range of environmental, social and economic topics covered by a given SA tool or model, determine how well they can potentially incorporate ES in their assessment. Although many farm-level tools and models are not intended to explicitly account for ES and SDGs, the possibility that their thematic coverage implicitly covers ES and SDGs should not be excluded from consideration. It is therefore worthwhile to evaluate the potential of farm-level SA tools and models to incorporate ES and contribute to attaining the SDGs by carefully reviewing their methodologies.
The focus of this study is twofold: the first, to assess a group of well-established, farm-level SA tools (SAFA, RISE, KSNL, DLG) and models (MODAM, MONICA, APSIM), according to their potential for covering agriculture-related ES; the second, to evaluate how these SA tools and models support the realization of the SDGs. Accordingly, two objectives are defined: (1) review and catalog information on the methodologies of each SA tool and model; (2) review and compare thematic content of each tool (indicators) and model (outputs), concerning the coverage of ecosystem services and SDGs. By elucidating similarities and differences on the thematic scope of each SA tool and model, the overarching goal of this study is to gain insights into how SA methods can be improved to better reflect new paradigms of sustainability within agriculture.
Many studies have engaged in reviewing and comparing agriculture SA tools. Gasparatos et al. [22] categorized SA into monetary, bio-physical and indicator-based approaches. Ness et al. [23] developed a framework that classified SA according to temporal scope (e.g., ex-ante or ex-post assessment) and whether they were product related, integrated, or indicator based. Binder et al. developed a comparison framework to analyze agricultural tools, in regard to normative, systemic and procedural characteristics, categorizing tools according to top-down or bottom-up approaches [24]. Reviewing SA tools according to time, data and budgetary requirements, Merchand et al. [25] identified two types of farm-level assessment: rapid assessment appraisals (RSA) and full assessment appraisals (FSA). De Olde et al. [19] conducted a coverage analysis, to explore similarities and differences of thematic scope between SA tools, and developed a continuum that describes farm-level tools, according to difficulty of implementation versus degree of comprehensiveness [26].
Farm models are similar to SA tools, in that they can be used to help guide farming decisions toward more sustainable management. However, agricultural models themselves are primarily designed for conducting scientific research and, secondarily, for farm-level decision support [27]. Farm models typically combine multiple process-based models to simulate the impacts of management decisions in a prospective (ex-ante) capacity on bio-physical and economic processes at the field, farm and regional scale [28]. Farm models can be divided into static or dynamic approaches, where time is a driving factor of dynamic models, allowing the integration of 'real-time' changes of bio-physical and economic processes in simulations. In contrast, static models use linear programming, which means that they cannot readily capture changes from feedback loops inherent in bio-physical processes [27].
Although existing farm-level SA tools and models tend to place significantly more emphasis on assessing the environmental dimension of sustainability than on the social or economic dimensions [19,29], they have not been designed with the express purpose of capturing the value of agricultural ES. However, this does not exclude the possibility that current SA tools and models can implicitly cover ES through their methods and thematic scope (indicators). The same can be said in terms of the potential of SA tools and models to contribute to attaining the SDGs, as their thematic scope may implicitly incorporate many of the same normative values. Therefore, SA results derived from models and tools are extremely valuable, not only for the specific cases of their application, but also in meta-analyses to derive generalized information about the performance of farming systems. This would allow an upscaling of local responses to global driving forces, such as climate change, global demand dynamics or policies. However, such meta-analysis is only possible if standardized terminology (ontology) and indicators are used that allow for comparisons across tool and model applications. In this sense, it is necessary to review and compare farm-level SA tools and models, to assess their potential for explicitly and implicitly covering ES, and their contribution to the attainment of the SDGs.
To choose SA tools and models for this study, basic selection criteria had to be fulfilled, such that tools and models should (i) be usable for making management decisions at the farm level, (ii) utilize indicator-based scoring, (iii) use multi-criteria analysis, (iv) have been applied within Germany as a test case example, and (v) have sufficient primary literature available in English or German. We chose Germany as a test case because it is characterized by comparatively large-scale agriculture with a low yield gap and by farmers with high education levels, who are accustomed to the use of tools and models for strategic decision making. Based on these criteria, four farm-level SA tools (SAFA, RISE, KSNL, DLG) and three farm-level models (MODAM, MONICA, APSIM) were selected for this study.
The next sub-sections provide brief overviews of each tool and model involved in the study, outlining general information (see Table 1) and classifying each according to the key characteristics of Schader et al. [20], including primary purpose, level of assessment, geographical scope, sector, and themes, as well as structure (see Table 2).

Sustainability Assessment of Food and Agriculture Systems (SAFA)
SAFA is used for the monitoring and self-assessment of enterprises in the food and agricultural sector, focusing on crop and livestock production, as well as forestry and fisheries. The scope of its assessment can also be extended along agricultural supply chains. Developed by the Food and Agriculture Organization (FAO) of the United Nations, its goal is to create an internationally recognized benchmark for agricultural sustainability assessment [30].
SAFA's structure is based on a hierarchical framework of dimensions, themes, sub-themes and indicators. At the highest level are four dimensions: economic resilience, environmental integrity, good governance, and social well-being. Each dimension is divided into themes, consisting of various sustainability goals, which are further divided into sub-themes. Sub-themes describe concrete objectives for sustainability performance, such as reducing greenhouse gas emissions, promoting community investment, and ensuring worker safety. At the lowest level of the hierarchy are indicators, which are used to score the sustainability performance of sub-themes. Scoring is based on a weighted-sum aggregation of indicators into sub-themes.
SAFA provides numerous default indicators and allows for customizable indicator selection. Indicators can be determined by direct measurement, modelling, or expert opinion. Indicators are evaluated by comparison to reference values. Indicators are categorized into three groups: plan-, practice- and performance-based indicators. When selecting indicators for the assessment, preference is given to performance-based indicators, as they provide quantitative metrics for measuring sustainability performance. When performance-based indicators are not readily available to the assessor, plan- and practice-based indicators are used instead. Default indicators are provided in the SAFA manual and specified in an indicator supplementary document [31]. SAFA does not provide farms with certification; however, if a farm can demonstrate that the assessment was conducted transparently, reference can be made to 'Consistency with the SAFA principles and procedures' [30].
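The indicator preference and the weighted-sum aggregation of indicators into a sub-theme score can be illustrated with a minimal sketch. It is not SAFA's actual scoring procedure: the indicator names, the assumed 1-5 score range, the weights, the normalisation by total weight, and the helpers `select_indicators` and `aggregate_subtheme` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Indicator:
    name: str      # hypothetical indicator label
    kind: str      # "performance", "practice" or "plan"
    score: float   # hypothetical rating on an assumed 1-5 scale
    weight: float  # hypothetical relative weight


def select_indicators(candidates):
    """Prefer performance-based indicators; otherwise fall back to the
    plan- and practice-based indicators that are available."""
    performance = [c for c in candidates if c.kind == "performance"]
    return performance if performance else list(candidates)


def aggregate_subtheme(indicators):
    """Weighted sum of indicator scores, normalised by the total weight
    (the normalisation is an assumption made for this sketch)."""
    total_weight = sum(i.weight for i in indicators)
    if total_weight == 0:
        raise ValueError("at least one indicator needs a non-zero weight")
    return sum(i.score * i.weight for i in indicators) / total_weight


# Hypothetical indicators for a greenhouse gas sub-theme.
ghg = [
    Indicator("GHG balance", "performance", 4.0, 2.0),
    Indicator("GHG mitigation practices", "practice", 3.0, 1.0),
    Indicator("GHG reduction target", "plan", 2.0, 1.0),
]

with_performance = aggregate_subtheme(select_indicators(ghg))
without_performance = aggregate_subtheme(select_indicators(ghg[1:]))
print(with_performance, without_performance)
```

In the first call only the performance-based indicator is scored; in the second, where no performance-based indicator exists, the practice- and plan-based indicators are averaged by weight.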

Response-Inducing Sustainability Evaluation (RISE)
RISE is a holistic, indicator-based sustainability tool that focuses on production at the farm level [32]. Developed by the Swiss College of Agricultural, Forest and Food Sciences in 2000, the purpose of RISE is to support farmers in recognizing specific on-farm deficiencies in sustainability performance. Its secondary purpose is to communicate ideas of regional and global sustainability for use at the policy level. Since 2016, RISE has adopted many of the indicators of SAFA in an effort to promote standardization and comparability.
RISE consists of 10 themes and 46 indicators. Themes cover a range of environmental issues, e.g., biodiversity, water usage, and soil quality, as well as social and economic issues, e.g., working conditions and economic viability. Indicators are used to evaluate sustainability at the theme level. RISE allows for selecting indicators based on goal setting, or based on the type of farming enterprise under review. Data on indicators are collected from regional databases, as well as questionnaire-based interviews with the farmer. Algorithms and thresholds are used to aggregate indicators and generate theme scores. RISE does not provide certification; however, the assessment generates a report outlining recommendations to improve sustainability.

Sustainability Standard of the German Agricultural Society (DLG Sustainability Standard)
The DLG sustainability standard provides analysis and certification for farms and agricultural products within Germany. The assessment covers food, energy crop and livestock production at the farm and plot level. It was conceived in 2005 by the German Agricultural Society, a representative body of the agricultural sector, in cooperation with the Technical University of Munich, Martin-Luther University Halle-Wittenberg and the Institute for Sustainable Agriculture Halle [33], with the goal of ensuring that agriculture actively promotes a sustainable economy through documentation and communication.
The DLG standard is based on the REPRO environmental and economic management model [34], which consists of a variety of highly complex and interwoven sub-models. The assessment is structured along environmental, economic and social dimensions, which are further divided into a total of 11 sectors of analysis and 25 indicators. Sectors of analysis in the environmental dimension include climate protection, resource input, biodiversity, and soil and water protection. Data on indicators are obtained from three years of field records and financial statements, as well as a questionnaire filled out by the farmer. Indicators are evaluated against a variety of thresholds and regional benchmarks, and are aggregated into a sub-index at the level of each dimension. All three dimensions are then aggregated into an overall sustainability index. A DLG assessment is conducted by a certified auditing agency, which provides a report on farm sustainability and detailed information on each indicator. Certification of the DLG sustainability standard is issued to a farm, provided that specific standards of sustainability are fulfilled.

Criteria for Sustainable Farming (KSNL)
The KSNL assessment provides sustainability analysis and certification for crop, livestock and bioenergy production at the farm level. Its aim is to provide farmers with advice, by identifying deficiencies regarding different SA criteria. Beginning with the criteria for ecologically compatible farming (KUL) module that was developed by the State Institute of Agriculture Thuringia in 1994, KSNL was formally established in 2000, with the addition of the criteria for economically compatible farming (KWL) module and the criteria for socially compatible farming (KSL) module [35]. The KUL, KWL, and KSL modules are divided into a total of 12 categories and 37 criteria. Categories within KUL include fertilizer balances, soil protection, pesticide use, biodiversity, and energy balance. Criteria are evaluated and scored according to predetermined tolerance thresholds. The KSNL assessment allows farmers to compare their sustainability performance on each criterion against that of their peers within Germany. If sustainability thresholds are attained, then KSNL provides farm certification.
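Evaluating criteria against predetermined tolerance thresholds can be sketched as a simple range check. The criteria names, tolerance ranges, farm values and the all-criteria-must-pass certification rule below are invented for illustration and are not the actual KSNL criteria or thresholds.

```python
# Hypothetical criteria with made-up tolerance ranges (lower, upper);
# these are NOT the actual KSNL criteria or thresholds.
TOLERANCE = {
    "nitrogen_balance_kg_per_ha": (-50.0, 50.0),
    "humus_balance_kg_c_per_ha": (-75.0, 300.0),
    "pesticide_intensity_index": (0.0, 1.0),
}


def evaluate(farm_values, tolerance=TOLERANCE):
    """Return {criterion: True/False}, True if the value is within range."""
    results = {}
    for criterion, (lower, upper) in tolerance.items():
        value = farm_values[criterion]
        results[criterion] = lower <= value <= upper
    return results


def certifiable(results):
    """Assumed rule: every criterion must lie within its tolerance range."""
    return all(results.values())


# Hypothetical farm data.
farm = {
    "nitrogen_balance_kg_per_ha": 35.0,
    "humus_balance_kg_c_per_ha": 120.0,
    "pesticide_intensity_index": 1.3,
}

results = evaluate(farm)
failed = [name for name, ok in results.items() if not ok]
print(failed)                # criteria outside their tolerance range
print(certifiable(results))  # whether the assumed certification rule is met
```

A list of failed criteria of this kind corresponds to the deficiencies on which a farmer would receive advice.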

Farm-Level Assessment Models
MODAM is a static, process-based model that employs multi-objective linear programming to assess farm-level economic and environmental sustainability of crop and livestock production [36,37]. Hosted by the Leibniz Centre for Agricultural Landscape Research (ZALF), MODAM is primarily used for research, to investigate production practices according to associated economic and environmental targets. As such, the model consists of two separate economic and environmental sub-models that are interlinked, so that specific production processes can be evaluated according to trade-offs between environmental and economic outcomes. The model accounts for metrics on economic performance (e.g., costs, revenues, and gross margins), and ecological indicators (e.g., nitrate leaching, erosion and greenhouse gases). Using a fuzzy-logic tool, the effects of production practices on selected ecological indicators are assessed and indexed in relation to site-specific impacts. The model allows for the selection of environmental indicators, or environmental quality targets (EQTs), and the site-specific analysis of impacts on EQTs from different production practices. MODAM is capable of performing scenario analysis by integrating changes in exogenous variables, such as prices, and is therefore suitable for conducting impact assessments of agri-environmental policies. MODAM has been mainly applied in Europe, in particular Germany [38].
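The kind of economic versus environmental trade-off that such a static optimization explores can be illustrated with a toy example. This is not MODAM: the two crops, their per-hectare coefficients and the 100 ha farm are hypothetical, and instead of a linear programming solver the sketch simply enumerates whole-hectare land allocations, which is a brute-force stand-in that only works for a problem this small.

```python
LAND_HA = 100

# Hypothetical per-hectare coefficients (not taken from MODAM).
CROPS = {
    "maize": {"gross_margin": 600.0, "nitrate_leaching": 40.0},
    "clover_grass": {"gross_margin": 350.0, "nitrate_leaching": 10.0},
}


def farm_outcome(maize_ha):
    """Gross margin and nitrate leaching for one land allocation."""
    clover_ha = LAND_HA - maize_ha
    margin = (maize_ha * CROPS["maize"]["gross_margin"]
              + clover_ha * CROPS["clover_grass"]["gross_margin"])
    leaching = (maize_ha * CROPS["maize"]["nitrate_leaching"]
                + clover_ha * CROPS["clover_grass"]["nitrate_leaching"])
    return margin, leaching


def best_allocation(max_leaching):
    """Maximise gross margin subject to a cap on total nitrate leaching.

    Returns (maize hectares, gross margin), or None if no allocation
    satisfies the cap.
    """
    feasible = [
        (farm_outcome(ha)[0], ha)
        for ha in range(LAND_HA + 1)
        if farm_outcome(ha)[1] <= max_leaching
    ]
    if not feasible:
        return None
    margin, maize_ha = max(feasible)
    return maize_ha, margin


# Tightening the environmental target lowers the attainable gross margin.
for cap in (4000.0, 2500.0, 1000.0):
    print(cap, best_allocation(cap))
```

Varying the leaching cap traces out the trade-off between the economic and the environmental objective; a scenario analysis would additionally vary exogenous coefficients such as the gross margins.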

Model for Nitrogen and Carbon Dynamics in Agro-Ecosystems (MONICA)
Primarily used for research, MONICA is a dynamic, process-based model that simulates crop growth under different climate conditions [39]. The model analyzes the relationship between crop growth and soil characteristics under different climatic conditions and cropping practices. By building on the nitrogen cycle simulation of the HERMES model [40], MONICA introduces a carbon cycle component to simulate the long-term impacts on soil organic content (SOC) and crop growth under changes in atmospheric CO₂ [41]. Simulations are conducted at the plot level with a resolution of 1 m², and can be extended to farm-level, as well as regional-level, impact assessments. Most applications have been conducted within a European context, such as simulating impacts on SOC of different residue management scenarios [42] and predicting the success of various crop rotations under different climate scenarios [43].

Agricultural Production Systems sIMulator (APSIM)
Developed in the early 1990s by the Agricultural Production Systems Research Unit (APSRU) of the Queensland State Government in Australia, the original purpose of the APSIM modelling framework was to model plant growth under various bio-physical and economic conditions [44]. In recent years, the growing popularity of APSIM has led to an expansion of its modular framework, and to numerous additions to its modelling capabilities, allowing for the modelling of soil, tree, and livestock bio-physical processes under various management practices [45]. APSIM uses dynamic, process-based modelling to incorporate bio-physical feedbacks in its simulations. It has been used in research, to assess farm-level management practices, climate change and climate risk adaptation strategies, agro-forestry strategies, livestock and pasture strategies, and nutrient leaching at field and regional scales [46,47].

Agriculture-Related Ecosystem Services
The concept of ecosystem services (ES) is used to demonstrate the benefit of nature to human well-being [8]. Principally, agricultural ecosystems supply benefits to farmers and society through the provisioning of materials (e.g., food, feed, fiber, and energy). However, agricultural activity is also linked to a wide range of less tangible ES, including pest regulation, maintenance of nutrient and hydrological cycles, pollination, erosion control, bio-remediation and diversity of genetic resources, as well as scenic beauty and recreation. Within the highly managed environmental context of agricultural management, ES are the product of the coupled interaction between agricultural management and the ecosystems in which it is embedded [10].
In order to assess agricultural ES, it is necessary to categorize and classify them using a standardized typology. The primary motivation behind standardizing ES is to facilitate the comparison of studies across regional and thematic boundaries, to allow for upscaling and deriving synthesis information, and to make ES studies policy relevant [6]. Over the years, several typologies have been created, including the Millennium Ecosystem Assessment [8], The Economics of Ecosystems and Biodiversity [48] and the Common International Classification of Ecosystem Services (CICES) [49]. This study utilizes the CICES (V5.1) framework, as it provides the most comprehensive ES classification system to date, represents the state of the art in its field, and is used for ES accounting by the European Environment Agency [50]. CICES provides a hierarchical and nested structure of sections, divisions, groups, classes and class types. Sections are divided into biotic and abiotic provisioning, regulation and maintenance, and cultural services. This study utilizes these sectional distinctions to organize the structure of the analysis, and identifies ES at the class level. CICES identifies 83 ES classes in total.
However, because not all CICES classes are germane to agricultural management, a short list of the most relevant ES classes was compiled to facilitate the analysis of this study. Only services that were deemed relevant to arable farming in a European context were considered for the analysis, e.g., services related to marine ecosystems were excluded. The resulting list included 31 ES classes (see Appendix A). As many CICES classes have long and cumbersome names, it was prudent to use abbreviated CICES class names, adopted from Paul et al. [51], to facilitate the analysis.
The short list of 31 agriculture-related CICES classes was then used to conduct a coverage analysis for each SA tool and model. Using descriptions of indicators obtained from the literature, the indicators from each tool and model were compared and matched to ES classes in the short list. Determining whether an indicator could be matched to an ES class was qualitative, and required a degree of interpretation, i.e., direct and explicit linkages between ES and indicators were sometimes difficult to make. Instead, by referencing the content included in the primary literature of the tools and models, as well as scientific publications documenting their usage, it was possible to interpret how some tool and model indicators could be potentially used as proxy values for determining ES coverage. For example, in some cases, matching was straightforward, e.g., the CICES class 'Soil quality by decomposition and fixing processes' (2.2.4.2) could be matched to indicators associated with soil organic matter. In other cases, matching had to be made more indirectly, e.g., CICES classes associated with the provision of biomass, such as 'Cultivated terrestrial plants for nutrition' (1.  Table 3.
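The matching step of the coverage analysis can be sketched as a lookup from indicators to ES classes, followed by a count of short-listed classes matched at least once. The sketch below is illustrative only: it uses a four-class excerpt of the short list (codes and abbreviated names as quoted in this article), and the indicator names and matches belong to an imaginary tool, except for the soil organic matter match, which mirrors the example given above.

```python
# Excerpt of the short list, using class codes and names quoted in the text.
SHORT_LIST = {
    "2.2.4.2": "Soil quality by decomposition and fixing processes",
    "3.1.1.1": "Recreation through activities in nature",
    "3.1.1.2": "Recreation through observation of nature",
    "3.1.2.3": "Culture or heritage from interaction with nature",
}

# Hypothetical indicator-to-class matches for an imaginary tool.
# Only the soil organic matter match mirrors the example in the text.
INDICATOR_MATCHES = {
    "soil organic matter": {"2.2.4.2"},
    "public access to farmland": {"3.1.1.1", "3.1.1.2"},
    "farm profitability": set(),  # no ES class matched
}


def covered_classes(indicator_matches, short_list):
    """Union of all matched classes that are part of the short list."""
    matched = set()
    for classes in indicator_matches.values():
        matched |= classes
    return matched & set(short_list)


def coverage_share(indicator_matches, short_list):
    """Fraction of short-listed classes matched by at least one indicator."""
    covered = covered_classes(indicator_matches, short_list)
    return len(covered) / len(short_list)


covered = covered_classes(INDICATOR_MATCHES, SHORT_LIST)
missing = sorted(set(SHORT_LIST) - covered)
print(sorted(covered), missing, coverage_share(INDICATOR_MATCHES, SHORT_LIST))
```

Because the matching itself is qualitative, the dictionary of matches is where the interpretation enters; the counting that follows is mechanical.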

Agriculture-Related Sustainable Development Goals
In 2015, the General Assembly of the United Nations adopted the Sustainable Development Goals [11]. As the cornerstone of the 2030 Agenda for Sustainable Development, the 17 SDGs lay out a broad path toward achieving environmental, social and economic sustainability on a global scale. To achieve the SDGs, 169 time-bound targets have been specified. UN members are required to report annually on progress toward attaining these targets [52]. Although agriculture is only explicitly mentioned in SDG 2: Zero Hunger, the majority of the 17 SDGs can be related back to agriculture in some manner [12]. To understand how farming decisions contribute to achieving the SDGs, it is necessary to identify where the thematic scope of farm-level SA tools and models converges with the targets/indicators outlined in the SDGs. To make these connections explicit, this study conducted a coverage analysis to systematically match indicators from SA tools and models to the SDG targets.
Because not all 169 SDG targets are relevant to arable agriculture in Europe, a short list of the most agriculturally relevant SDG targets was first compiled. Through expert opinion and a review of the literature, a short list of 50 agriculture-related SDG targets was formulated (see Appendix B). The SDG target names in the list were abbreviated to facilitate the analysis; however, their original target numbers were retained as reference. Following a similar procedure as in the previous sub-section, indicators of the SA tools and models were then matched and compared to the short list of SDG targets.

Coverage of Ecosystem Services by Tools and Models
The coverage analysis revealed that provisioning services had the most comprehensive coverage across tools and models, e.g. Cultivated terrestrial plants for nutrition (1.  Table 4 gives an overview of results of the matching exercise. Cultural services had the least amount of coverage. Recreation through activities in nature (3.1.1.1) and Recreation through observation of nature (3.1.1.2) were covered in KSNL and DLG, but were absent in all other tools and models. SAFA was the only tool to cover Culture or heritage from interaction with nature (3.1.2.3). Cultural services were not covered by any of the models.

Overview of SDG Coverage
SDG2: Zero hunger, SDG8: Decent work and economic growth, and SDG15: Life on land had the highest amount of coverage across SA tools and models, followed by SDG6: Clean water and sanitation, SDG13: Climate action, SDG12: Responsible consumption and production, and SDG1: End poverty. Some degree of coverage was found for SDG3: Good health and well-being, SDG7: Affordable and clean energy, SDG14: Life below water, and SDG16: Peace, justice and strong institutions, while none of the tools or models covered SDG9: Industry, innovation and infrastructure. Table 5 provides an overview of the resulting coverage analysis.
In regard to individual SDG targets, targets 2.04 Promote practices that improve land and 8.04 Resource use efficiency, were covered by all SA tools and models. Additionally, targets 1.02 Reduce poverty by half and 2.03 Increase agricultural productivity, were covered by all tools and models with the exception of MONICA. Targets 6.06 Protect water ecosystems, 13.01 Adaptive capacity to climate-related hazards, 14.01 Prevent/reduce marine pollution (nutrient pollution) and 15.02 Protect terrestrial ecosystems, shared a similar degree of coverage across SA tools and models.
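Target-level matches can be rolled up to the goal level by reading the goal number from the target code. The sketch below encodes only the coverage stated in the paragraph above for targets 2.04, 8.04, 1.02 and 2.03; the data structure and the helper names `goal_of` and `goals_covered` are assumptions made for illustration, and the result is not the full coverage reported in Table 5.

```python
TOOLS_AND_MODELS = ["SAFA", "RISE", "KSNL", "DLG", "MODAM", "MONICA", "APSIM"]

ALL = set(TOOLS_AND_MODELS)

# Target-level coverage as stated in the text for four targets.
TARGET_COVERAGE = {
    "2.04": ALL,               # Promote practices that improve land
    "8.04": ALL,               # Resource use efficiency
    "1.02": ALL - {"MONICA"},  # Reduce poverty by half
    "2.03": ALL - {"MONICA"},  # Increase agricultural productivity
}


def goal_of(target_code):
    """'2.04' -> 2: the part before the dot is the SDG number."""
    return int(target_code.split(".")[0])


def goals_covered(target_coverage):
    """Map each tool or model to the set of SDGs it touches via any target."""
    goals = {name: set() for name in TOOLS_AND_MODELS}
    for target, names in target_coverage.items():
        for name in names:
            goals[name].add(goal_of(target))
    return goals


goals = goals_covered(TARGET_COVERAGE)
print(sorted(goals["SAFA"]))    # goals reached through these four targets
print(sorted(goals["MONICA"]))
```

A goal counts as covered as soon as one of its targets is matched, so goal-level coverage says nothing about how many targets within a goal are addressed.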
Overall, SA tools showed a greater amount of coverage of SDG targets than models. Out of the SA tools, SAFA had the highest coverage of SDGs, followed by RISE, KSNL, and DLG. Of the models, APSIM covered the most targets, followed by MODAM and MONICA.

Discussion
By conducting a coverage analysis that focused on indicators of SA tools and models, it was possible to evaluate their thematic coverage, which revealed substantial differences in their relative potential to cover ES and to contribute toward attaining the SDGs. SA tools (SAFA, RISE, DLG, KSNL) had broader potential coverage of agriculture-related CICES classes in comparison to the farm-level models (MODAM, MONICA, APSIM). Out of all tools and models involved in the study, only SAFA and RISE could be considered comprehensive in terms of both potentially covering ES and attaining the targets outlined in the SDGs.
Overall, SAFA had the broadest potential coverage of ES and SDGs. Although SAFA only mentions ES explicitly in Ecosystem enhancing practice (E 4.1.2) and Structural diversity of ecosystem services (E 4.1.3), the general importance of ecosystems within the context of agricultural management is repeatedly mentioned throughout its manual and Supplementary Material. SAFA's relatively broad coverage of SDG targets can be attributed to the affiliation of the FAO with the UN; and, as SAFA predates the SDGs, it can only be assumed that some of the sustainability criteria outlined in SAFA were used in shaping the SDG targets. RISE showed a similar degree of coverage in terms of ES and SDGs, which can be attributed to recent efforts on behalf of RISE to harmonize its indicators with those of SAFA [32]. Based on these findings, we conclude that SAFA should continue to be regarded as the standard for farm-level sustainability assessment.
The results suggest that a broad range of indicators, as well as customizable indicator selection, is conducive to covering a broader range of ES and SDG targets. This was specifically observed in SA tools such as SAFA and RISE. By design, SAFA and RISE are intended to be globally applicable in scope, which means that they must be adaptable to a wide variety of geographic and normative contexts, hence the necessity to provide a broad set of customizable indicators [53]. This flexibility of indicators allowed SAFA and RISE to cover a wide range of ES and SDGs. On the other hand, it should not be overlooked that customizable indicator selection could unintentionally, or even intentionally, obscure deficiencies in sustainability performance if not based on a proper materiality analysis [54].
A divergence was observed between tools and models with regard to their potential coverage of ES and SDGs; models generally possessed far fewer indicators and, thus, exhibited a narrower coverage of ES and SDGs. On the basis of the definition of tools and models, as well as the distinction between them used in this paper, models are more limited than tools with regard to thematic scope, because they are typically developed and applied for answering questions within research and policy that are often highly context-specific. Additionally, due to the level of scientific rigor associated with conducting research, model assessments are limited to indicators whose values have been validated through scientific studies. However, even though models covered fewer ES classes, Merchand et al. [25] argued that tools or models which rely more on quantitative indicators and complex algorithms to capture indicator interaction have a higher degree of credibility, which lends itself to more accurate portrayals of some ES. On the other hand, sustainability assessment is about identifying trade-offs between competing sustainability targets [55]. A wide range of indicators is therefore a pre-requisite for any model or tool employed for SA.
De Olde et al. [26] pointed out that the comprehensiveness of a tool or model is at odds with its usability. This suggests that if tools and models try to capture too many ES (via the inclusion of more indicators), conducting the assessment and communicating results to farmers and policy-makers may become too difficult. This should be taken into consideration when developing future farm-level SA tools and models.
Even though the ES concept has an environmental bias [56] (which some have claimed is already a problem in current SA [25]), there is still room for better articulating and integrating the ES concept in farm-level tools and models. However, until there is consensus on terminology, i.e., indicators for measuring agricultural ES, it will be difficult to explicitly account for them in farm-level assessments [57]. Additionally, as consensus grows on how to measure agricultural ES, it will become easier to assess how the sustainable management of agricultural ES contributes to broader sustainability objectives, such as the SDGs.
The contribution of local, farm-level decision making to sustainability targets at the global level is a question of high relevance in global assessment studies. However, generalization and the upscaling of local level assessments to global level requires the ability to compare and aggregate across a wide range of local case study results. This, again, is only possible with standardized metrics and indicators [57,58], in particular when novel methods of automated data mining and text analysis are employed [59]. The utilization of the CICES indicator terminology in SA tools and models would be an important step forward in the generalization and upscaling of local assessment outcomes.

Conclusions
This study evaluated current farm-level assessment tools and models, in light of new concepts and principles of sustainability. A coverage analysis was conducted to investigate how adequately ES and the UN SDGs are potentially incorporated by the thematic content (indicators) of a select group of common farm-level tools (SAFA, RISE, DLG, KSNL) and scientific models (MODAM, MONICA, APSIM). The results of the study revealed that SAFA outperformed its counterparts in terms of its potential to cover ecosystem services and the SDGs, which suggests that it should continue to be viewed as the standard within the field of farm-level sustainability assessment. This review also found deficiencies in current tools and models, as they do not sufficiently articulate the concept of ecosystem services within their methods. Moving forward, tools and models should be developed that explicitly consider ES and the SDGs. To achieve this, a harmonization of terminology regarding agricultural ES is a pre-requisite. Additionally, SA could benefit from the further standardization of metrics and indicators, as per SAFA and RISE. In doing so, future assessment tools and models will be better equipped to reflect new paradigms of sustainable agriculture.

Acknowledgments:
Additional funding and technical support were provided by the Leibniz Centre for Agricultural Landscape Research (ZALF).

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.