Agricultural Sustainability: A Review of Concepts and Methods

Abstract: This paper presents a methodological framework for the systematic literature review of agricultural sustainability studies. The framework synthesizes the available literature review criteria and introduces a two-level analysis facilitating systematization, data mining, and methodology analysis. The framework was implemented for the systematic literature review of 38 crop agricultural sustainability assessment studies at farm level for the last decade. The investigation of the methodologies used is of particular importance since there are no standards or norms for the sustainability assessment of farming practices. The chronological analysis revealed that the scientific community's interest in agricultural sustainability has increased in the last three years. The most used methods include indicator-based tools, frameworks, and indexes, followed by multicriteria methods. In the reviewed studies, stakeholder participation proved crucial in determining the level of sustainability. It should also be mentioned that the combined use of methodologies is often observed; thus, a clear distinction between methodologies is not always possible.


Introduction
The world's population is rapidly increasing and, according to the most recent projections, it is expected to reach 9.8 billion in 2050 and 11.2 billion in 2100 [1]. The planet should therefore be ready to cope with the expected rapid population growth. Producing and delivering adequate, high-quality food will be one of the most important challenges for humanity in the next century [2]. The evolution of technology has led to the intensification of agricultural production, increasing productivity and (in most cases) the quality of agriproducts as well. However, this intensification has significantly increased the environmental footprint of agriculture, leading to a number of environmental impacts associated with the extensive use of fertilizers, pesticides, and water, changes in land use, etc. [3]. The environmental issues related to agriculture have drawn the attention of the scientific community, which is now exploring the definition of agricultural sustainability without having yet reached consensus [4,5].
Undoubtedly, defining agricultural sustainability, as with every other sustainability concept, is a challenging task. Nevertheless, it is commonly agreed that agricultural sustainability should at least address the three basic pillars of sustainable development by simultaneously appraising environmental, economic, and social issues related to agricultural practices [6]. However, the sustainability assessment of agricultural practices can, in general, be very challenging since it involves many case-specific variables that must be taken into consideration. Figure 1 presents various processes, inputs, and outputs involved in agricultural production, demonstrating the difficulty and complexity of generalizing the sustainability assessment process. There are general cultivation guidelines and corresponding operation stages for almost all crops (e.g., seeding, irrigation, and harvesting). However, the agronomic practice, the machinery types, the technology level, as well as the quantities and types of materials used may vary, depending on the type of crop, the implementation practice, the country (even the region of the cultivation), and the prevailing climatic conditions. All of the aforementioned parameters affect the cultivation process and the respective inflows and outflows.
It is obvious that the standardization of agricultural sustainability assessment is a challenging task. Considering the growing interest in assessing the sustainability issues related to agriculture, several tools and methodologies have been developed [7,8]. Among those tools, some have gained greater acceptance and are widely used by practitioners worldwide, such as life cycle assessment (LCA), which is standardized by ISO in ISO 14040:2006 and ISO 14044:2006 [9]. In addition, many indicator-based methods have been developed for the sustainability assessment of agricultural practices; these use different approaches with regard to the overall objective, the intended users, and the definition of agricultural sustainability they employ [4]. Considering the above, and given that there is not yet an established standardized methodology, it is very important for anyone attempting to assess agricultural sustainability to have an overview of the available and most commonly used methodologies and tools for that purpose. As a result, there is a need for a methodological framework that will help practitioners evaluate the existing tools and methods in order to select the appropriate one for each specific task.
To that end, the present paper has a two-fold objective:
• To determine the evaluation criteria to systematically review agricultural sustainability assessment studies. To that end, several review papers were selected based on specific selection criteria and examined to determine the goal as well as the individual evaluation criteria adopted in each review. The ultimate goal is to critically synthesize a methodological framework for the systematic recording and evaluation of available agricultural sustainability assessment studies. Such systematic documentation can facilitate the comparison among the available studies as well as the development of a standard methodological framework for the sustainability assessment of agriculture.
• To implement the proposed methodology by investigating the available and mostly used methodologies to assess the sustainability of crop cultivations at the farm level. The methodological framework is applied to 38 Agricultural Sustainability studies published in peer-reviewed journals in the last decade (2009-2018).

Research Design
The evaluation process implemented to assess and select the criteria needed for the methodological framework of the systematic review on agricultural sustainability studies is presented in Figure 2. Initially, scientific literature published in Science Direct and Scopus was searched using the specific keywords and Boolean operators (AND/OR). The keywords were selected with respect to the integrated concept of "sustainability assessment", as well as the individual processes it consists of, namely, "environmental assessment", "economic assessment", and "societal assessment" (or "social assessment") combined with the keywords agriculture/farming using the Boolean Operator AND to exclude results that are not relevant to the field under examination. It should be added that the concept of "agricultural sustainability" was also included in the search.
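The keyword combination described above can be illustrated with a short sketch. The keywords are those quoted in the text; the way they are grouped (all concept terms joined with OR, then combined with the agriculture/farming terms through AND) and the query syntax are assumptions made for illustration, not the exact search strings submitted to Science Direct or Scopus.

```python
# Concept keywords quoted in the text. Grouping them in a single OR block
# is an assumption; the paper does not give the exact query strings.
CONCEPT_TERMS = [
    "sustainability assessment",
    "environmental assessment",
    "economic assessment",
    "societal assessment",
    "social assessment",
    "agricultural sustainability",
]
FIELD_TERMS = ["agriculture", "farming"]


def or_group(terms):
    """Quote each term, join the terms with OR, and wrap them in parentheses."""
    return "(" + " OR ".join(f'"{t}"' for t in terms) + ")"


def build_query(concepts, fields):
    """Combine the concept group and the field group with a single AND."""
    return f"{or_group(concepts)} AND {or_group(fields)}"


query = build_query(CONCEPT_TERMS, FIELD_TERMS)
print(query)
```

The AND between the two groups plays the role described in the text: a record must match at least one sustainability concept and at least one field term, which filters out results unrelated to agriculture.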
The first sample of scientific papers that resulted from the initial search included 55 papers from peer-reviewed scientific journals. These papers were put through a screening process based on the exclusion criteria presented in Figure 2. Specifically, studies that were not related to agriculture, or that focused on alternative agricultural processes, were excluded. As a result, papers focused exclusively on aquaculture or organic farming, on biofuels and biorefinery, as well as reviews of studies comparing agronomic protocols, were excluded from the present assessment. Additionally, review studies regarding soil quality, land management, and food processing systems, as well as discussions that did not specifically define the methods of the review conducted, were excluded. It should also be stated that, in the context of agricultural sustainability studies, livestock farming was included in the search. The final paper collection comprises 16 review papers or studies that assess agricultural sustainability. It should be noted that, compared with other scientific fields (for example, the secondary production of goods), the literature is relatively scarce regarding studies that consider all three dimensions of sustainability. To that end, the sample includes studies considering the environmental aspect of agricultural sustainability, which is the one most often studied.
The sample was then assessed in two ways, systematic and critical [10]. The systematic way concerns the listing of the papers based on specifically defined criteria [11]. In the case of the presented framework, the initial listing criteria include the title and author of the paper, the year of publication, the spatial coverage of the study (Global or Regional), and the type of review (Critical or Systematic).
Critical reviews are thorough literature works that attempt to evaluate and assess the basic aspects or inputs and document the differences in methodology and implementation of scientific studies on a specific field [11]. In this case, the critical evaluation of the sample concerns the individual analysis of the selected studies with the purpose of extracting the individual evaluation criteria used in each study. The individual criteria with similar context were aggregated in a general table of criteria. Then, each paper was systematically reviewed as to whether each criterion was included in the review.
The resulting table is a comprehensive overview of the issues most frequently examined in a review study. The most frequently used criteria are those that should be integrated in the methodological framework for the systematic review of agricultural sustainability studies. The rule followed in the present paper was to exclude criteria that were used in fewer than four papers. The following sections present the sample and the criteria frequency table, along with a critical assessment of the sample used for the evaluation.
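The frequency rule stated above can be sketched as a simple count-and-filter step. The review labels and criterion assignments below are invented toy data, not the contents of Table A1; only the threshold (a criterion is dropped when it appears in fewer than four papers) comes from the text.

```python
from collections import Counter

# Rule stated in the text: exclude criteria used in fewer than four papers.
MIN_PAPERS = 4


def criteria_frequency(reviews):
    """Count in how many reviews each criterion appears (at most once per review)."""
    counts = Counter()
    for criteria in reviews.values():
        counts.update(set(criteria))
    return counts


def retained_criteria(reviews, min_papers=MIN_PAPERS):
    """Return, in alphabetical order, the criteria that meet the threshold."""
    counts = criteria_frequency(reviews)
    return sorted(c for c, n in counts.items() if n >= min_papers)


# Hypothetical toy data: labels and assignments are illustrative only.
reviews = {
    "review_A": ["target user", "type of data", "system boundaries"],
    "review_B": ["target user", "type of data"],
    "review_C": ["target user", "type of data", "functional unit"],
    "review_D": ["target user", "type of data"],
    "review_E": ["target user", "functional unit"],
}

print(retained_criteria(reviews))
```

In this toy sample, "target user" appears in five reviews and "type of data" in four, so both are retained, while "functional unit" (two reviews) and "system boundaries" (one review) fall below the threshold.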

Systematic Approach
The 16 review papers extracted through the first steps of the methodology described in the previous section are presented in Table 1, along with their classification with respect to their type and spatial coverage. As presented in Figure 3, the number of review papers increased during 2016-2017, indicating a boosted interest in the sustainability of agricultural practices. Payraudeau et al. (2005) first analyzed and systematically reviewed six (6) agricultural sustainability methods employed in eleven (11) case studies, indicating the variety of objectives, target groups, and methodologies used [20]. Bockstaller et al. (2008) followed by presenting a typology of indicators and the evolution of the methods used for their advancement [19] and, in 2009, critically evaluated four (4) comparative studies to analyze the methods of comparison, highlighting their main results [23]. Also focusing on indicators, Binder et al. (2010) presented an evaluation review framework that was used to review agricultural sustainability methods [4]. The framework assessed the normative, systematic, and procedural aspects of the methods under evaluation.
Regarding the types of review papers and their classification as systematic or critical according to the definitions presented in the previous section [11], it is observed that, in principle, both categories are equally preferred by researchers. However, in some cases, the distinction is not clear, or a systematic and a critical review are performed at the same time. One such example is the work of De Luca et al. (2017), where the authors performed a critical and systematic review to determine, among other issues, which Multi Criteria Decision Analysis (MCDA) and participatory methods have been used along with LCA tools and the type of integration used in each case [10]. Also, Baldini et al. (2017) critically reviewed forty-four (44) LCA studies on milk production and systematically compared their methods and results to highlight issues requiring further discussion and investigation [15].
Considering the selected sample, it can be stated that in most cases systematic reviews are used to compare methodologies and results regarding a specific field of agricultural application. For example, the authors of [17] conducted a chronological review of LCA studies in pig production, attempting to demonstrate how LCA has captured technological advancements in the field as well as the methodological issues observed.
On the contrary, the majority of the reviews characterized as critical deal with the evaluation of indicator-based methods or the classification of agricultural sustainability indicators, such as the work of Acosta-Alba et al. (2011), who reviewed eight (8) agricultural sustainability frameworks that use reference values for their indicators, analyzed the methods for establishing the reference values, and investigated ways for their improvement [14]. Latruffe et al. (2016) provided a review of the available agricultural sustainability indicators, highlighting the relatively high increase in environmental indicators as compared with the smaller interest in economic and social indicators [18]. Finally, Lebacq et al. (2013) reviewed the types of sustainability indicators and proposed indicative ground rules for the selection of agricultural sustainability indicators [22].
With respect to the spatial coverage of the reviews (Figure 4), the majority deal with studies from all around the world. Nevertheless, there are reviews assessing studies in specific countries or regions. For example, Roy et al. (2012), based on a systematic review and synthesis, presented a set of indicators that could be used to assess agricultural sustainability in Bangladesh, highlighting the need for integrated approaches and participatory processes during agricultural sustainability assessment [13]. Additionally, Morais et al. (2016) systematically reviewed twenty-two (22) agri-food-dedicated LCA studies in Portugal, revealing issues regarding the challenges faced and the lack of a systematic regional approach in the country that could safeguard the accuracy and comparability of the results [16]. Lastly, Yan et al. (2011) reviewed thirteen (13) LCA studies on European milk production, indicating that direct comparison is challenging due to inconsistency regarding the methodologies used [9].

Critical Approach
The selected sample, which was thoroughly described in the previous section, was screened to extract the individual evaluation criteria used in each review. As some criteria had the same objective or the same context, they were categorized accordingly. Also, some studies further analyzed the criteria by including various subcriteria, but this is outside the scope of this paper since it relates to the level of scrutiny each author aims to achieve and the corresponding scope of the review. Figure 5 presents the criteria identified during the screening process and the frequency of their occurrence. A total of forty-four (44) different criteria were used in the sixteen (16) studies reviewed. The review criteria frequency table is presented in detail in Appendix A (Table A1). The first six criteria (beginning from the top of Figure 5) were common in most of the reviews examined and include the name and description of the assessment method or tool, the field of application, the country of application, and the year of publication. The literature typology concerns the type of the document reviewed. For example, De Luca et al. (2017) classified the selected publications into three categories (Journal Article, Book Chapter, and Conference Proceedings paper) [10]. Baldini et al. (2017), on the other hand, referred to publication types, classifying the sample according to whether the literature is an original article, a review, a research direction, or a scenario analysis [15].
[7,15], following a cradle-to-gate or cradle-to-market approach, whereas de Vries et al. (2015) reviewed studies at least from cradle-to-farm gate [21]. Lastly, Peter et al. (2017) examined both the level of assessment (global, regional, etc.) and the system boundaries (farm-gate or farm-gate-grave) of the studies they reviewed [12].
The intended user of a method or tool is considered in several of the studies reviewed. Binder et al. (2010) identified the target group of the examined methodologies [4], whereas De Luca et al. (2017) referred to this criterion as the actors involved in the assessment process (i.e., local experts, scientists, workers, etc.) [10]. Bockstaller et al. (2008) classified the reviewed works according to the target user of the method reviewed, i.e., decision-maker, researcher, technician, or farmer [19]. Regarding the type and accessibility of data criteria, Baldini et al. (2017) distinguished the data into experimental and model data [15]. The accessibility of data (or availability, as expressed by Roy et al. (2012) [13]) was examined by Bockstaller et al. (2008) for three user groups: farmers, advisors, and administration [19].
With reference to the name and type of the indicators reviewed, many approaches were identified during the screening process. Lebacq [10].
Based on the rule set in the methodology section, the red line in Figure 5 presents the criteria exclusion threshold. Only criteria identified more than four times in the sample reviewed are included in the methodological approach for the systematic review of agricultural sustainability studies. A total of eighteen (18) criteria surpassed the exclusion threshold. These criteria are classified in groups with respect to their context and are presented in the subsequent section.

Methodological Framework Presentation
Following the criteria determination process described in the previous sections, Figure 6 presents the critical synthesis of the criteria used to systematically review agricultural sustainability related studies. The proposed methodological framework is based on a series of criteria divided into five (5) underlying categories. The first two categories refer to the initial screening stage. During this preliminary stage, each study is assessed to determine whether it will be included in the sample on the basis of the case-specific exclusion criteria determined with regard to the scope of the review.
The initial screening stage includes two categories of criteria (i.e., "method identification" and "general information") concerning the basic description of each study. The general information of a study concerns the year of publication, the type of literature (journal article, conference proceedings paper, book chapter, technical report, etc.), and the country in which the study was conducted. The method identification category includes criteria that deal with the assessment method developed or employed. The criterion "description of the assessment tool" characterizes the method or tool presented according to whether the study presents a new methodology, applies an existing method or tool, or combines the two, namely a new methodology implemented with an application example. The last criterion is the level of the assessment performed, i.e., global, national, regional, or farm level, according to the approach introduced by Gomez-Limon et al. (2010) [24].
After the initial screening phase is completed and the sample to be reviewed is finalized, the in-depth review stage follows. For this stage, three (3) categories of criteria have been defined. The first category of criteria assesses the scope of the studies reviewed. The first criterion is the identification of the goal (or objective) of each assessment, so that it is feasible to perform comparative reviews among studies with the same objective. For that purpose, following the definition of Gaviglio et al. (2017), the papers are classified according to whether a method is "goal prescribing" or "system describing" [25]. Other criteria proposed concern the determination of the target user, as well as the functional unit and the time dimension of the assessment.
The second category refers to the identification of impacts starting with the definition of the sustainability dimension examined in each study, continuing with the documentation of the impacts considered during the assessment expressed in indicators (name and type). The last category concerns the data and the calculation methods used for the assessment. The criterion type of data examines whether the data used are model or experimental. Furthermore, to examine the accessibility of data, the present study refers to the definition of Angevin et al. (2017) [26]. Therefore, depending on the data used, the assessment can be characterized as ex ante (indicating expectation and uncertainty) when focusing on assessing a new scenario or as ex post (indicating processing actual field data) when examining a current situation [26]. Additionally, for each study reviewed, the validation and aggregation methods should be examined too.
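The structure described above (two stages, five categories, each with its own criteria) can be written down as a small data structure that doubles as a data extraction form. This is only a sketch: the category labels for the in-depth stage and the criterion names are paraphrased from the text rather than copied from Figure 6, the list is not claimed to be complete, and `Category`, `FRAMEWORK`, and `blank_record` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Category:
    stage: str       # "initial screening" or "in-depth review"
    name: str        # category label (paraphrased from the text)
    criteria: tuple  # criterion names (paraphrased, possibly incomplete)


# Assumed encoding of the framework as described in the prose.
FRAMEWORK = (
    Category("initial screening", "general information",
             ("year of publication", "type of literature", "country")),
    Category("initial screening", "method identification",
             ("description of the assessment tool", "level of assessment")),
    Category("in-depth review", "scope",
             ("goal", "target user", "functional unit", "time dimension")),
    Category("in-depth review", "identification of impacts",
             ("sustainability dimension", "indicator name", "indicator type")),
    Category("in-depth review", "data and calculation methods",
             ("type of data", "accessibility of data",
              "validation method", "aggregation method")),
)


def blank_record(framework=FRAMEWORK):
    """Return an empty extraction form: one slot per criterion, grouped by category."""
    return {cat.name: {c: None for c in cat.criteria} for cat in framework}


# Example: start filling the form for one reviewed study.
record = blank_record()
record["data and calculation methods"]["accessibility of data"] = "ex post"
print(record["data and calculation methods"])
```

Keeping one record per reviewed study in this shape makes the comparison across studies a matter of reading the same slot in each record.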
The proposed methodological framework aims at facilitating the comparison among studies in order to capture the research advancements and current practices in the field under examination. This is an issue of particular importance since the assessment of agricultural sustainability is not a standardized process and entails a plethora of different methods, tools, and frameworks that assess a large number of different indicators that represent an analogously large number of different impacts. Prior to designing any assessment model, an exhaustive review is mandatory to safeguard consistency and relevance with other works. Also, the systematic documentation of the advancements in the field is the only way to begin constructing a unified, commonly accepted methodology for agricultural sustainability assessment.

Search Scheme
The methodology presented above was used to investigate the available and mostly used methodologies to assess the sustainability of crop cultivations at the farm level. The review begins with the collection of the initial sample of papers by searching within the most acknowledged databases and more specifically, Scopus and Science Direct. The search scheme is based on specific keywords and their combination as presented in Table 2, and the use of Boolean operators (OR and The first category of criteria assesses the scope of the studies reviewed. The first criterion is the identification of the goal (or objective) of each assessment, so as it is feasible to perform comparative reviews among studies with the same objective. For that purpose, following the definition of Gaviglio et al. (2017), the papers are classified according to whether a method is "goal prescribing or "system describing" [25]. Other criteria proposed concern the determination of the target user, as well as the functional unit and the time dimension of the assessment.
The second category refers to the identification of impacts starting with the definition of the sustainability dimension examined in each study, continuing with the documentation of the impacts considered during the assessment expressed in indicators (name and type). The last category concerns the data and the calculation methods used for the assessment. The criterion type of data examines whether the data used are model or experimental. Furthermore, to examine the accessibility of data, the present study refers to the definition of Angevin et al. (2017) [26]. Therefore, depending on the data used, the assessment can be characterized as ex ante (indicating expectation and uncertainty) when focusing on assessing a new scenario or as ex post (indicating processing actual field data) when examining a current situation [26]. Additionally, for each study reviewed, the validation and aggregation methods should be examined too.
The proposed methodological framework aims at facilitating the comparison among studies in order to capture the research advancements and current practices in the field under examination. This is an issue of particular importance since the assessment of agricultural sustainability is not a standardized process and entails a plethora of different methods, tools, and frameworks that assess a large number of different indicators that represent an analogously large number of different impacts. Prior to designing any assessment model, an exhaustive review is mandatory to safeguard consistency and relevance with other works. Also, the systematic documentation of the advancements in the field is the only way to begin constructing a unified, commonly accepted methodology for agricultural sustainability assessment.
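As a minimal sketch of how the review criteria described above could be recorded for each study, the following Python snippet defines one record per reviewed study and tallies a single criterion across a sample. The field names, the allowed values, and the four example records are illustrative assumptions and do not come from the reviewed sample.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class StudyRecord:
    """One reviewed study described by the in-depth review criteria (hypothetical schema)."""
    reference: str        # hypothetical citation key
    goal: str             # "goal prescribing" or "system describing"
    target_user: str      # e.g. "farmers", "decision-makers", "researchers"
    functional_unit: str  # e.g. "ha", "kg of product"
    time_dimension: str   # e.g. "single year", "multi-year"
    data_type: str        # "model" or "experimental"
    timing: str           # "ex ante" or "ex post"
    validated: bool       # is a validation process described?
    aggregated: bool      # is an aggregation method used?


def share(records, field):
    """Return the percentage of records for each value of the given criterion."""
    counts = Counter(getattr(record, field) for record in records)
    return {value: round(100 * n / len(records), 1) for value, n in counts.items()}


# Made-up records, for illustration only.
sample = [
    StudyRecord("study-A", "system describing", "farmers", "ha", "single year", "experimental", "ex post", False, True),
    StudyRecord("study-B", "system describing", "researchers", "ha", "multi-year", "experimental", "ex post", True, True),
    StudyRecord("study-C", "goal prescribing", "decision-makers", "kg of product", "single year", "model", "ex ante", False, False),
    StudyRecord("study-D", "system describing", "decision-makers", "ha", "single year", "experimental", "ex post", False, True),
]

goal_shares = share(sample, "goal")
timing_shares = share(sample, "timing")
```

Keeping one flat record per study makes each criterion directly countable, which is the kind of systematization the two-level analysis is intended to support.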

Search Scheme
The methodology presented above was used to investigate the available and most commonly used methodologies for assessing the sustainability of crop cultivations at the farm level. The review begins with the collection of the initial sample of papers by searching within the most widely acknowledged databases, specifically Scopus and ScienceDirect. The search scheme is based on specific keywords and their combinations, as presented in Table 2, and on the use of Boolean operators (OR and AND) to increase the efficiency of the search. The initial search resulted in 959 papers containing the searched keywords. The initial sample was then screened based on the inclusion/exclusion criteria of Table 2. This secondary assessment resulted in 387 papers, which were then reviewed against the initial screening criteria (Figure 6). As the purpose of this review is to examine studies assessing crop agricultural sustainability at the farm level, the 387-paper sample was filtered to select the peer-reviewed journal articles that fulfilled the following criteria. (a) Examine all three pillars of sustainability (environmental, economic, and social).
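The keyword combination described in this section can be sketched as a small query builder that joins synonyms with OR and joins the resulting groups with AND. The keyword groups below are placeholders chosen for illustration; the actual terms are those listed in Table 2, and the exact query syntax accepted by each database may differ.

```python
def build_query(keyword_groups):
    """Join synonyms with OR inside each group, then join the groups with AND."""
    clauses = []
    for group in keyword_groups:
        quoted = ['"{}"'.format(term) for term in group]
        clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Hypothetical keyword groups (assumption; see Table 2 for the real ones).
groups = [
    ["agricultural sustainability", "sustainable agriculture"],
    ["assessment", "evaluation"],
    ["farm level"],
]

query = build_query(groups)
print(query)
```

OR widens the search within a concept, while AND narrows it across concepts, which is why the combination tends to make the search more efficient.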

Initial Screening
As presented in the previous section, 387 papers were reviewed in the initial screening stage. Filtering the reviewed sample according to the scope of this review resulted in 38 peer-reviewed journal articles. This section presents the initial systematic review of the 38-paper sample, using descriptive statistics to gain further insight into the general information that derives from the reviewed sample. With respect to the general information, the largest share of papers (21%) was published in 2017, whereas only two papers (5%) fitting the review criteria were published in 2012, 2011, and 2010 [27]. However, it is worth noting that 45% of the examined papers were published during the last three years (2016-2018), indicating a boost in the scientific community's interest in integrated sustainability assessment (Figure 7).
Regarding the geographical origin, as presented in Figure 7, half of the assessments were performed in Europe (50%), whereas 16% were performed in Asia. Additionally, only three out of 38 assessments were performed in North America. With respect to the literature typology of the studies reviewed, as mentioned before, only peer-reviewed journal articles were included in the reviewed sample. Regarding the method identification category, Table 3 presents all the methods and tools that were identified during the review process (the nomenclature is presented in Appendix A). All of the relevant methods are presented in detail later.
In the majority of the papers examined (66%), the methods or tools presented are also tested in practice through relevant examples (case studies). In 18% of the papers, an already existing methodology was applied and presented, while 16% of the papers presented a methodology without testing it in practice. Continuing with the level of assessment, in 79% of the works examined the assessment was performed exclusively at the farm level, whereas in 21% of the works the level of assessment was broadened beyond the farm level by also examining local, regional, or national sustainability. The most frequently examined crops are maize and wheat (five cases each), followed by olive, spinach, and rice (two case studies each). The other crops examined in the reviewed papers include legumes, lettuce, scallions, red radish, banana, soybean, grapes, cranberry, potato, and coffee. Additionally, different agronomic practices are examined, for example organic farms [28], greenhouse cultivations [29], and school gardens [30].
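The shares reported in this section are simple ratios over the sample, and the same arithmetic applies to the screening funnel. The sketch below uses only counts stated in the text (959, 387, and 38 papers, and three North American assessments); the function and variable names are assumptions, and the derived retention rates are computed here rather than reported by the authors.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100 * part / whole, 1)


initial, screened, final = 959, 387, 38  # counts stated in the text

kept_after_screening = percent(screened, initial)  # share surviving the Table 2 criteria
kept_after_filtering = percent(final, screened)    # share surviving the scope filter
north_america_share = percent(3, final)            # three of the 38 assessments

print(kept_after_screening, kept_after_filtering, north_america_share)
```

Under these counts, roughly two fifths of the initial hits pass the inclusion/exclusion screening and roughly one tenth of those remain in the final sample.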

In-Depth Review
This section presents the systematic review results against the in-depth review criteria, starting with the scope criteria category (Tables A2 and A3 of Appendix A). Regarding the goal of the assessment, 61% of the examined studies are system describing, whereas the remaining studies (roughly 40%) attempt to identify and evaluate policies and techniques that could be used to improve agricultural sustainability performance. Regarding the target users of the methodologies proposed, the majority of the examined works are aimed at decision-makers, farmers, and researchers. More specifically, 40% of the studies identify decision-makers as their target users, whereas 26% aim at farmers and 21% aim at researchers. Continuing, only three (3) 2015) preferred functional units related to the weight of the final product ("kg of un-/packed fresh product at the point of sale-POS" and "1 tn fresh weight standardized to 86% dry matter", respectively) [36,43].
Concerning the criterion of the time dimension, in several studies the assessment was performed for a single year period [25,28,29,33,36,41,51,60]. However, there are also studies that perform the assessment for a range of years. Snapp [43].
Regarding the Impact Identification category, as described above, the research scope contains only studies that attempt to examine all three dimensions of sustainability, namely the environmental, economic, and social pillars, thereby contributing to an integrated sustainability assessment. During the extensive review, all of the individual impacts (expressed as indicators) that were examined within the reviewed studies were extracted and documented. However, further classification of and commentary on the individual indicators goes beyond the limits of this analysis and has already been investigated in several past review studies [4,13,18,19,22].
With respect to the data calculation method category of criteria, for 82% of the papers examined a validation process is not mentioned. Only 18% of the papers describe a validation process for the proposed methodologies. On the other hand, 74% of the studies mention the use of an aggregation technique or methodology aiming at the simplification and the generalization of the results. Regarding the type of data used for the assessments performed (Figure 8), the majority uses experimental data (68%), whereas a small percentage of works (18.4%) employ only model data for the sustainability assessment. Accordingly, 58% are ex post assessments attempting to evaluate current practices; whereas, in 31.6% of the papers, the evaluation of prediction scenarios is attempted.

Agricultural Sustainability Methods and Tools
In the previous sections, a descriptive qualitative analysis of the review criteria was presented. The aim was to examine the research trend of crop agricultural sustainability, and specifically the trend of the criteria concerning the scope and the calculation methods used. In this section, the methodologies and tools extracted as a result of the review are presented. Figure 9 demonstrates the methods and tools identified and the corresponding frequency of occurrence. These methods and tools were classified into five major categories based on the main scope of the assessment (as expressed by the authors); the selected categories may overlap as part of the overall concept. A distinctive example is MCDA, which is used to facilitate the assessment of multivariate problems that are expressed with indicators. Nevertheless, studies employing MCDA methods focus on the aggregation of the results, while methods proposing indicator sets and indexes focus on determining the criteria of the assessment. Another example is the carbon footprint (CF), an indicator that is often found in indicator sets and frameworks but is also very commonly used as a standalone methodology for environmental impact assessment.
To that end, LCA methods relate to the life cycle of the examined element. Environmental methods relate to the quantification of the environmental impact of the examined element, and economic methods refer to the use of financial methods in the impact assessment. Multicriteria methods employ multicriteria assessment for the evaluation of agricultural sustainability, and indicator methods include indicator sets and frameworks for the assessment of agricultural sustainability. With respect to the individual methodologies that were identified, the term "indicators" refers to all those methodologies that were not given a specific name by their developers.
[31]. For the assessment, the authors combined a series of tools to evaluate the three pillars of sustainability, namely, LCA for the environmental pillar, LCC for the economic pillar, and SLCA for the societal pillar. They integrated their results by employing the AHP method for multicriteria analysis [31]. From the economic methods category, Van Passel et al. (2009) proposed a methodological framework based on the sustainable value approach (SVA) to assess sustainability at the farm production level [58]. Van Passel et al. employed the SVA method to relate farm performance to the consumption of resources. The work represents a benchmarking approach, since it does not evaluate sustainability in absolute terms but assesses performance compared to standards [58]. Van Passel et al. (2011) stated that, to perform multilevel and multi-user assessments, a combination of methodologies can offer more advantages than integrated methodologies [53]. To that end, the SVA method was combined with the MOTIFS indicator tool. According to Van Passel et al. (2011), MOTIFS is a visual monitoring tool used for the aggregation of indicators of various themes, which creates benchmarks for the rescaling of the indicator values [53].

Multicriteria Assessment Methods and Tools
Within the multicriteria assessment methods that are used for assessing agricultural sustainability, the works examined can be classified into groups that employ and develop the same methodological framework. One such group comprises the studies that use the MASC decision model developed by Sadok et al. (2009), which was built as part of the decision support system DEXi [59]. The MASC model is a hierarchical multiattribute decision support model designed for the ex ante assessment of cropping systems to address the need for in-field evaluation of alternative scenarios. Such models allow for the simplification of the decision problem by downscaling it to smaller and less complex problems expressed by designated variables [59]. The DEX methodology performs aggregation of qualitative attributes and utility functions using "IF-THEN" aggregation rules [59]. Colomb [26]. The IPM-based systems were designed and tested in nine (9) locations in Europe [38]. They compared the sustainability of the examined systems, discussing the benefits or drawbacks of the IPM systems. Vasileiadis et al. (2017) also adopted methodologies from the environmental and economic categories. Economic data were collected from participants, using a template, in order to perform a cost-benefit analysis (CBA). Furthermore, an environmental risk assessment was performed by implementing the SYNOPS-WEB Tool [38]. Lastly, Chopin et al. (2017) adapted the MASC model in order to assess ex ante the sustainability in the area of local banana farming systems [37].
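To illustrate the general idea of hierarchical aggregation of qualitative attributes with IF-THEN rules, the sketch below evaluates a tiny attribute tree in Python. It is not the MASC model or the DEXi software: the tree, the attribute names, the three-level scale, and every rule in the table are assumptions made up for this example.

```python
# Hypothetical IF-THEN table for a node with two qualitative inputs.
# Key: (value of first child, value of second child) -> value of the parent.
CAUTIOUS_RULES = {
    ("low", "low"): "low",
    ("low", "medium"): "low",
    ("low", "high"): "medium",
    ("medium", "low"): "low",
    ("medium", "medium"): "medium",
    ("medium", "high"): "medium",
    ("high", "low"): "medium",
    ("high", "medium"): "medium",
    ("high", "high"): "high",
}


def evaluate(node, leaf_values):
    """Recursively evaluate a tree whose inner nodes are (name, rules, children)."""
    if isinstance(node, str):  # leaf attribute: read the value given for the scenario
        return leaf_values[node]
    _name, rules, children = node
    child_values = tuple(evaluate(child, leaf_values) for child in children)
    return rules[child_values]  # IF child values THEN parent value


# Hypothetical two-level tree: the overall score depends on an aggregated
# environmental attribute and on a directly assessed economic attribute.
TREE = ("overall", CAUTIOUS_RULES, [
    ("environmental", CAUTIOUS_RULES, ["pesticide_risk_control", "soil_quality"]),
    "economic",
])

scenario = {"pesticide_risk_control": "high", "soil_quality": "medium", "economic": "high"}
result = evaluate(TREE, scenario)
print(result)
```

Because every rule is listed explicitly, the outcome for a scenario can be traced back to the rules that produced it, which is one reason qualitative rule-based models are attractive for ex ante comparisons of alternatives.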
Multicriteria methods facilitate decision making while considering multiple variables, and such methods use weighting techniques in order to produce composite indices [24]. Among the studies examined, the most frequently used methods are the principal component analysis ( [40,46]. Concluding with the multicriteria method category, Siciliano et al. (2009) used the social multicriteria evaluation (SMCE) framework, implemented through the NAIADE (novel approach to imprecise assessment and decision environments) software, to assess the sustainability of farming practices in a small rural area in Italy [60]. Egea et al. (2016) employed the analytic hierarchy process (AHP) in order to investigate the combination of protected designation of origin oil production systems that leads to optimal sustainability [39]. Bockstaller et al. (2017) introduced the CONTRA tool, an innovative aggregation method that leads to the creation of decision trees using fuzzy sets [34]. Peano et al. (2014) proposed a multicriteria methodology to evaluate the effectiveness of the Slow Food Presidia, which are organized structures aiming to preserve quality production at risk of extinction by following specific guidelines and protocols for each product category [48].
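As a hedged illustration of how a weighting technique can yield a composite index, the following Python sketch derives criterion weights from an AHP-style pairwise comparison matrix and applies them to normalized scores. It uses the row geometric mean method, a common approximation of the principal-eigenvector weights, and not necessarily the procedure of any reviewed study. The comparison matrix and the farm scores are invented for the example.

```python
import math


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def composite_index(scores, weights):
    """Weighted sum of normalized scores (all scores assumed to lie in [0, 1])."""
    return sum(score * weight for score, weight in zip(scores, weights))


# Hypothetical pairwise comparisons on a 1-9 scale for three criteria
# (environmental, economic, social); entry [i][j] states how much more
# important criterion i is judged to be than criterion j.
pairwise = [
    [1.0, 3.0, 5.0],
    [1.0 / 3.0, 1.0, 3.0],
    [1.0 / 5.0, 1.0 / 3.0, 1.0],
]

weights = ahp_weights(pairwise)
farm_scores = [0.6, 0.8, 0.4]  # made-up normalized scores for one farm
index = composite_index(farm_scores, weights)
print(weights, index)
```

The weights sum to one by construction, so the composite index stays on the same 0 to 1 scale as the individual scores.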

Indicator Sets, Indexes, and Frameworks
This category of methods and tools contains indicator sets, indexes, and frameworks that were used in the reviewed works to assess agricultural sustainability at the farm level. Walter et al. (2009a; 2009b) proposed a new indicator-based method to assess the unsustainability of a system rather than its sustainability [62]. Their method borrows elements of the LCA methodology and was implemented in two stages. The first stage includes the creation of an issue inventory and its contextualization, while the second stage includes the standardization and sustainability valuation process [61,62]. Rodriguez et al. (2010) proposed the APOIA-NovoRural framework, which comprises a collection of basic and composite indicators covering five dimensions of sustainability: landscape ecology, environmental quality, sociocultural values, economic values, and management and administration [56]. Sharma et al. (2011) introduced a methodology based on questionnaires and surveys and composed an agricultural sustainability index (ASI) targeted to Bihar province (India) [55]; they also calculated the sustainability parameters for a 60-year period. Sami et al. (2013) selected six indicators that were considered appropriate to assess sustainability in a regional context and, in order to evaluate some of these indicators, used a selection of fuzzy submodels [52]. Van Asselt et al. (2014) proposed a protocol for the collection and evaluation of indicators for the sustainability assessment of agri-food production systems [49]. Their proposed list covers a wide range of indicators related to the three pillars of sustainability, aiming at supporting policy makers in decision making by choosing the most relevant indicators. Yegbemey et al. (2014) proposed an innovative participatory approach that resulted in seventeen (17) indicators. All relevant data were collected through a household survey. Sustainability was evaluated with relative scores, while the total sustainability level was based on the average scores of the individual indicators [47].
Peano et al. (2015) proposed the SAEMETH monitoring tool, based on a set of qualitative indicators. The selection of the indicators was based on the criteria introduced by Meul et al. (2008) [63] and, for their evaluation, a minimum and a maximum threshold were set based on reference values that were derived from best practices or through surveys [45]. Santiago-Brown et al. (2015) presented the process for selecting indicators to assess the sustainability of viticulture production. For the selection of the indicators, the adapted nominal group technique was used. The selected indicators were reduced according to their relevance [44], resulting in seventy-six (76) indicators ranked by importance. Allahyari et al. (2016) selected five hundred and eighty-eight (588) indicators through an extensive literature review. After removing duplicates and prioritizing the sample, 62 indicators remained, which were used in an extensive survey among experts. The indicators were assessed based on their importance, while the resulting data were assessed with the Minkowski fuzzy screening method [42]. Sajjad et al. (2016) examined the relevant agricultural sustainability at farm and regional scale using the sustainable livelihood security index (SLSI) [41]. Yang et al. (2016) assessed the sustainability of greenhouse vegetables using indicators. More specifically, to examine the greenhouse vegetable farming practices and the economic and social management conditions, they used rapid and participatory rural appraisal (RRA/PRA) tools combined with data derived from in-field measurements and parallel surveys [29]. In 2016, de Olde et al. proposed the sustainability assessment tool named response-inducing sustainability evaluation (RISE), which was implemented for the evaluation of organic farms in Denmark. The tool contains indicators for a total of 10 themes and 51 subthemes. The indicators were normalized and aggregated, and each theme was evaluated based on the average score of the relevant subthemes [28].
Goswami et al. (2017) integrated the sustainable livelihood (SL) and the drivers-pressures-state-impact-response (DPSIR) frameworks, proposing a small farm sustainability index (SFSI) that could address the complexity of small-holder family farms under a participatory approach [35]. The proposed framework assesses sustainability at multiple levels, assigning the relevant weights and resulting in the creation of an aggregated index for the entire system. They indicate that the introduction of ICT in agriculture (web-based platforms, wireless sensors, etc.) can facilitate data sharing among stakeholders and provide the basis for assessing the sustainability of farming systems. Recanati et al. (2017) proposed an indicator-based framework for the assessment of the sustainability of small-scale farming systems in water-limited regions. They implemented the framework by modeling an "average" farm based on a survey among 30 farmers [33]. Gaviglio et al. (2017), attempting to integrate various analytical techniques, introduced the 4AGRO tool, an online self-assessment tool based on indicators. It consists of 42 subindicators that are divided into 15 complex indicators, five for each pillar of sustainability [25]. The tool was demonstrated in an agricultural park in Italy. Finally, Snapp et al. (2018) proposed a methodology based on indicators that were derived through a participatory approach involving a steering committee with multidisciplinary participants from eight (8) institutions [32]. The indicators were normalized based on maximum possible values.
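Several of the tools in this category share two simple steps: rescaling each indicator against reference thresholds, and averaging the rescaled scores within a theme and then across themes. The Python sketch below shows these two steps under stated assumptions; it does not reproduce any specific tool, and the themes, thresholds, and raw values are invented. All indicators are assumed to be of the "higher is better" type.

```python
def rescale(value, lower, upper):
    """Rescale a raw indicator value to [0, 1] between two reference thresholds."""
    if upper <= lower:
        raise ValueError("upper threshold must exceed lower threshold")
    clipped = min(max(value, lower), upper)  # values beyond a threshold are capped
    return (clipped - lower) / (upper - lower)


def mean(values):
    """Arithmetic mean of a non-empty iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


# Hypothetical themes; each indicator is (raw value, lower threshold, upper threshold).
themes = {
    "soil": [(6.0, 0.0, 10.0), (2.0, 0.0, 4.0)],
    "economy": [(150.0, 100.0, 200.0)],
    "community": [(12.0, 0.0, 10.0), (3.0, 0.0, 10.0)],
}

# Theme score: average of its rescaled indicators; overall score: average of themes.
theme_scores = {
    name: mean(rescale(*indicator) for indicator in indicators)
    for name, indicators in themes.items()
}
overall_score = mean(theme_scores.values())
print(theme_scores, overall_score)
```

Setting the lower threshold to zero and the upper threshold to the maximum possible value reduces the rescaling to normalization by the maximum. An unweighted average treats all themes as equally important; a weighted variant would replace the final mean with a weighted sum.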

Conclusions
To meet the ever-increasing interest in agricultural sustainability, many methodologies and tools have emerged, introducing integrated and holistic assessment approaches. However, there is still no consensus on the standardization of agricultural sustainability assessment as part of a unified concept of sustainable development. Newly introduced frameworks propose mostly case-specific tools that focus on resource use and their impact on the sustainability of farming practices. Combinational use of methodologies is observed in many cases; thus, a clear distinction of methodologies is not always possible. Contributing towards the indexing of the available methodologies, this paper presented a methodological framework for the systematic literature review of agricultural sustainability studies. The framework synthesizes all the available literature review criteria and introduces a two-level analysis facilitating systematization, data mining, and methodology extraction.
The framework was implemented for the systematic literature review of crop agricultural sustainability assessment studies at farm-level for the last decade. The investigation of the methodologies used is of particular importance since there are no standards or norms for the sustainability assessment of farming practices. The chronological analysis revealed that the scientific community's interest in agricultural sustainability has been increasing during the last three (3) years, indicating a tendency to gradually progress from the theory of economic growth to the more comprehensive and inclusive concept of sustainable development. Nevertheless, the critical evaluation of effectiveness and the implications of the methods presented are outside the scope of the present work and are subjects of thorough future research.
The most used methods include indicator-based tools, frameworks, and indexes, followed by multicriteria methods. In the reviewed studies, stakeholder participation proved crucial in determining the level of sustainability. However, a systematic assessment of the contribution of agricultural machinery and operation management to overall sustainability was not found in the examined studies. The effect of resource use and input management is the most commonly examined issue in the reviewed studies.