Harnessing a ‘Currency Matrix’ for Performance Measurement in Cooperatives: A Multi-Phased Study

The cooperative organizational form is by nature a sustainable one, which has proved to be resilient in the face of crises and a solid lever in addressing present-day societal challenges. Still, little is known about its socio-economic impact. Also, despite the plethora of studies on cooperative performance, research remains inconclusive about how best to measure it. In fact, scholarly work has largely favored appraisal tools that mirror those of investor-owned firms (IOFs), thereby undermining the dual idiosyncratic nature of the cooperative organizational form, which is manifest in its business and social-membership objectives. The goal of this article is to fill these gaps by delivering a comprehensive dashboard for cooperative performance assessment that harmonizes business–social aspects and catalogs the basic components for future attempts. To reach this goal, we used an extensive review of empirical research in cooperative performance (phase 1) and a Delphi study with 14 experts (phase 2). In addition, we reviewed comparable research efforts for a business form (social enterprises) that combines business with social goals and faces similar challenges (phase 3). This inquiry was particularly insightful for the social perspective and the overlooked role of cooperatives as a socially-embedded organizational form that hardly documents its societal impact and outreach.

other scholars subscribed to this line of research, each suggesting a different single objective that the cooperative (as a separate enterprise) would seek to maximize [39]. Empirical studies of cooperative performance mostly favored the profit-maximizing alternative, treating the cooperative firm as an IOF or an IOF-variant, albeit with different types of stockholders [21]. Not surprisingly, the empirical literature on cooperative performance has been dominated by two categories, with the first consisting of studies utilizing financial metrics, and the second comprising studies engaging in efficiency assessment [22].
We acknowledge that cooperatives have to meet mainstream corporate performance standards for the corporative body to survive (or thrive) as well as to continue delivering member and social benefits [58,59]. However, we attest to the view that success also needs to be appraised in terms of the benefits members receive as opposed to the performance of the cooperative alone [11,29,58,60–62]. Hence, in recognition of the dual nature of the cooperative organizational form, we prepared our preliminary framework along two broad categories. The first addresses more of the business nature of cooperatives and takes the organization as a unit of analysis. It is further divided into three sub-categories. The second broad category addresses the social-membership perspective, takes the member(s) as a unit of analysis, and is further divided into two sub-categories (see Table 1). The first two sub-categories, coded as "business financial appraisal" (BFA) and "business efficiency appraisal" (BEA) respectively, are similar to the dominant ones in the literature mentioned above. The third sub-category, coded as "subjective business appraisal" (SBA), relates to subjective and perceptual performance measures at an organizational level. As for the second set of sub-categories, the first one, coded as "objective membership appraisal" (OMA), is based on objective membership evaluations, while the second, coded as "subjective membership appraisal" (SMA), is based on subjective membership assessments.

Business Financial Appraisal (BFA)
BFA is grounded on financial (accounting) data typically found in a cooperative's financial statements. Such data reflect the effect of corporate strategic decisions and are customarily used as an input in financial ratio analysis [60,62]. The latter is a standard technique of financial performance evaluation, conveying crucial information on an organization's operations and financial situation [63]. Its use in empirical cooperative studies is widespread (e.g., [24,64–71]). Financial ratio analysis is used for comparative purposes too (e.g., industry-specific sector comparisons) [72,73]. Strikingly, a large body of work compares the performance of cooperatives with that of IOFs in the same sector(s) (e.g., dairy, grain, farm supply) (e.g., [73–80]). Moreover, some studies (e.g., [81–83]) employ sales-based metrics (e.g., market shares, sales growth, the Lerner index) next to financial ratios to paint a more complete picture of financial measures and cooperative performance.
Examining financial data and utilizing ratios provides officials, members, and creditors with a glimpse of the cooperative's strengths and weaknesses. In fact, financial measures have several advantages in terms of collectability, scalability, level of objectivity, and comparability [69,84]. Perhaps their chief virtue is that they are replicated and benchmarked across all types of organizations [38]. However, there are some inherent problems associated with them, particularly with common ratios (e.g., profitability, liquidity, debt ratios). Some problems are intrinsic to the ratios themselves, and some stem from the cooperative structure [70,85]. For instance, financial ratio analysis fails to consider that a cooperative can be seen as a vertically integrated entity including the members and their businesses [56] or to account for all of the financial effects of management decisions on the collective entity [86]. Also, traditional financial measures and analyses disregard the double role of members (i.e., users and owners) or that members are often paid above the market price for the products they supply to their cooperative [60,73,87]. Furthermore, neither financial measures nor ratio analyses account for the benefits of government support or the value of non-market benefits provided by the cooperative to members or the greater community [62,75]. Notwithstanding the drawbacks, financial measures remain primary in cooperative performance appraisal [22,70,88].

Electronic copy available at: https://ssrn.com/abstract=3498873

Business Efficiency Appraisal (BEA)
BEA is centered on production function data that is utilized for efficiency assessment and comparisons [89]. The term "efficiency" is used to describe the level of performance that can be reached by an economic unit in accordance with its production possibilities [90,91]. Economic efficiency, in particular, refers to a firm's ability to convert inputs into outputs and respond optimally to economic signals (e.g., prices) [92]. The study of economic efficiency measurement has a longstanding tradition, triggered by the seminal work of Farrell [93]. Farrell distinguished technical and allocative efficiency as the two components of economic efficiency. Technical efficiency refers to the ability of a firm to produce the maximum feasible output from a given bundle of inputs (output-oriented) or produce a given level of output using the minimum feasible amounts of inputs (input-oriented) [94]. Allocative efficiency assumes knowledge of the prices of the different employed inputs, in order to reach the optimum output at the lowest possible cost [95]. Technical and allocative efficiency, taken together, determine the overall economic efficiency of the firm [96]. If a firm is producing on the production frontier, using the optimal proportions of inputs given relative prices, the firm is said to be economically efficient [97].
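The relationship between the three efficiency notions can be written compactly. The block below is a sketch of the standard textbook reading of Farrell's decomposition, not a formula taken from the studies reviewed here; the symbols EE, TE, and AE are shorthand introduced for illustration, and the numerical values are hypothetical.

```latex
% Farrell-type decomposition (each score lies in (0, 1], with 1 = fully efficient)
\[
  EE = TE \times AE, \qquad 0 < TE,\ AE \le 1
\]
% Hypothetical example: a firm that is 80% technically efficient
% and 90% allocatively efficient
\[
  TE = 0.80,\quad AE = 0.90 \;\Rightarrow\; EE = 0.80 \times 0.90 = 0.72
\]
```

Under this reading, a firm is economically efficient (EE = 1) only if it is both on the production frontier (TE = 1) and using inputs in cost-minimizing proportions (AE = 1).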
As efficiency measurement techniques are based on economic theory, studies employing them often use input indicators for labor and capital, while for the output they commonly opt for turnover, sales or assets [88]. Depending on the different functions used (e.g., profit, cost), different efficiency variants might be favored (e.g., X-efficiency, cost efficiency, total factor productivity) [97,98]. Not unexpectedly, efficiency appraisal is rather popular in empirical cooperative studies (e.g., [91,94–96,99,100]), while quite a few compare the efficiency of cooperatives with that of IOFs in the same sector (e.g., [101–105]). Besides the various efficiency alternatives, in this sub-category, we also included other efficiency-related metrics commonly used in production or agricultural economics, such as scale and scope elasticities [106] or the comparative cost index [100].
It is notable that the greater accuracy of efficiency measures makes them an appealing alternative to ratio analysis [56]. Nonetheless, large data demands or confidential data (e.g., information on inputs and outputs) make these measures challenging to estimate [62,91]. The estimation becomes even more puzzling when multi-product and/or multifactor productive processes are examined [88]. Most importantly, as efficiency measures require an economic behavioral assumption (e.g., an objective of profit maximization or cost minimization) [92], extant studies view the cooperative as an independent firm with a single objective, neglecting to address the dual nature of the organization [21,27].

Subjective Business Appraisal (SBA)
SBA consists of measures relating to the judgmental assessment of internal or external respondents regarding an organization's performance [107,108]. Studies using these measures rely on survey-based direct elicitation means, following in the tradition of management and marketing studies which regularly employ the key informant method, whereby respondents well informed about organizational issues give answers to item statements [25,38]. These measures usually cover financial and other indicators (e.g., operational, social) and have only been used in a handful of empirical cooperative studies (e.g., [107,109–111]). SBA measurement is often favored when objective data is difficult to obtain or insufficiently reliable [108]. SBA metrics facilitate the assessment of complex issues (e.g., an expert's view on member satisfaction) [110,112] as well as that of non-financial or non-market aspects [60,109]. Moreover, SBA measurement enables cross-sectional analysis through sectors and markets in general, as performance can be quantified in comparison to objectives or competitors [38,107]. Despite their merits, SBA measures suffer from what their name suggests, namely a certain degree of subjectivity associated with psychological and cognitive biases [38]. In fact, SBA measurement might be plagued by common biases in behavioral research, like systematic error and common method variance [113], particularly when a single respondent provides answers across the survey instrument [114]. Finally, SBA studies might not accurately address the dual nature of the cooperative organization. That is, the indirect measurement of member perceptions only partially integrates the member perspective [25].

Objective Membership Appraisal (OMA)
OMA encompasses metrics relating to observable membership characteristics [29,115,116], particularly with respect to user-benefit and user-control arrangements. More specifically, this sub-category relates to pricing, delivery, services, and governance data, like prices paid to members by the cooperative, the percentage of in-selling (or side-selling), the scope and quality of services members receive, and the governance systems and procedures (e.g., CEO tenure, secret ballots, audited accounts, available information to members). In agricultural cooperatives, this sub-category may additionally cover features commensurate with patronage and the members' farms [117–119], such as farm financial ratios, profits obtained, productivity, and efficiency. One of the reasons why farmers join cooperatives is that they routinely face considerable risk of income variability, often due to monopolistic exploitation (e.g., price discrimination) from upstream or downstream partners [31,39]. Consequently, success at the farm level is naturally also contingent on cooperative membership and can, thus, be partially estimated based on patronage-related data [120,121].
OMA metrics showcase what benefits members receive as well as to what extent members support their cooperative in return [122]. They are based on objective data and, if cooperative registries are present or if the cooperative statutes are readily available, OMA information can be directly sourced. In the absence of such sources as well as when farm-level data is sought, survey-based methods (e.g., structured questionnaires) are used instead [119], which often makes the data collection process somewhat troublesome, as data access might be conditional on the consent of cooperative officials or members themselves [116]. Moreover, OMA measures in isolation cannot truly address the dual nature of the cooperative organization; they neither account for the performance of a cooperative as an entity nor reflect all member benefits (e.g., satisfaction with membership aspects). In reality, they do not integrate member perceptions, but rather member conduct, outward user-benefit or user-control arrangements, and farm performance.

Subjective Membership Appraisal (SMA)
SMA comprises measures relating to the judgmental assessment of cooperative members regarding the benefits they receive from membership and their cooperative's performance in general [123,124]. These measures habitually cover members' general stance towards the cooperative (e.g., overall satisfaction, intention to continue membership) [125,126], members' evaluation of financial aspects (e.g., satisfaction with price or market arrangements) [29,127], and members' evaluation of non-monetary membership aspects (e.g., members' influence on internal decision-making, satisfaction with information flow) [123,128]. In the vast majority of the few empirical cooperative studies that rely on SMA measures (e.g., [125,126,129]), multi-item scales are commonly favored. The latter are usually drawn from constructs developed and validated in mainstream marketing or management studies [127,130].
SMA measures facilitate the direct assessment of member benefits, unveiling how members think and feel towards their cooperative or even how they might behave in the future [123]. Also, SMA measures can capture non-pecuniary and non-market aspects of cooperative behavior [124]. Nevertheless, SMA data might be difficult or time-consuming to obtain, as it requires the consent and willingness of members to participate in field work, which might be challenging for producers or members of advanced age [126]. Moreover, similar to SBA metrics, SMA measurement might suffer from cognitive and psychological biases [38,113]. Finally, SMA measures alone cannot address the dual objective nature of the cooperative organization, as they do not account for the latter's performance as an entity. Members' benefits are naturally conditioned by the cooperative's achievements [112], so SMA metrics might mainly be reflecting rather than assessing organizational performance.

The Cross-Fertilization Potential with Social Enterprises
Social entrepreneurship is a way of addressing societal needs through the utilization of economically sustainable market strategies [131,132]. Social enterprises are social mission-driven organizations that trade in goods or services for a social purpose [133,134]. They are typically positioned between profit and non-profit organizations [135]. On the one hand, they differ from the former (hence also IOFs) as profit is a means to create social value rather than an end per se. On the other hand, they present an alternative to non-profit models which are naturally dependent on grants and donations [136]. In the past couple of decades, social enterprises have attracted considerable practical and scholarly interest [137,138], even though they belong to a relatively nascent area of research [139]. The growing interest in them is consistent with the mounting pressure on business organizations to spur positive social change by engaging in social or environmental initiatives [140].
So, social enterprises have a propensity to blend for-profit practices with non-profit ones, although they are neither typical charities nor traditional businesses like IOFs [141]. Of course, to address their core mission and, thus, optimize the creation and distribution of social value, they have to forego financial returns or reinvest them [132,142]. Combining business and social goals, they form part of the so-called 'social economy sector' which consists of those organizations that do not belong to the public and private sectors, like non-profit associations, mutual societies, and cooperatives [41,131]. In fact, social enterprises are considered hybrid organizations whose defining characteristic is the duality of social impact alongside financial sustainability [134,136,139]. Together with cooperatives, whose hybrid identity is inherent [35], they consistently demonstrate how to thrive as hybrid organizations attending to competing business-social demands [137,143].
Admittedly, social enterprises and cooperatives have many commonalities. They both have to be business-like and meet financial and commercial goals on top of their social ends [144]. They are both seen as promising vehicles for the creation of social and commercial value, as through their business ventures they offer a ray of hope in a world filled with longstanding socioeconomic and environmental issues [9,136,137]. Similar to cooperatives, which fill provision gaps [2,35,39], particularly in disadvantaged areas, social enterprises help those left behind and serve markets habitually underserved by IOFs or governments [139,145]. Actually, both social enterprises and cooperatives have the potential to be the architects and the engine of genuine social innovation [131], principally through the creation of business-social networks necessary to stimulate social change [36,132].
By the same token, cooperatives and social enterprises face a number of common challenges. First of all, the commercial activity of social enterprises might reduce their attention to the social mission [142], similarly to cooperatives, where business emphasis increasingly tempers their social character [37]. In other words, in their efforts to generate revenue, social enterprises run the risk of losing sight of their social missions, subjecting themselves to mission drift distress [132,139,140]. This concern echoes one of the profound trends in the social economy sector, namely steady rationalization and marketization [142,144,146]. In cooperatives, this trend has resulted in governance changes (e.g., reduced member involvement) [34], and a social capital drain [33]. In addition, focusing on both social and economic outcomes sets the stage for various forms of organizational tension (e.g., belonging, performing) [137], perplexing performance measurement too [147]. Performing tensions emerge from the divergent outcomes social enterprises deal with, such as the varied goals they need to set, the different metrics they have to employ, or even the inconsistent stakeholder demands they are compelled to satisfy [134]. For example, as performance evaluation extends to both social and financial operations [133], it is hard to sustain support for both social and financial metrics [137]. Undoubtedly, pecuniary indicators are crucial for evaluating sustainable organizational progress, yet, assessing the non-financial performance is arguably equally important to ensure the core mission is met [135,148].
Considering that cooperatives are also confronted with similar performing tensions and, given the commonalities identified [147], it seems instrumental to investigate how literature on social enterprises has tackled the complex issue of performance assessment and, thereby, inform the inquiry for cooperative organizations.

Materials and Methods
To reach the objective of our study, we divided our research process into three phases. In the first phase, our aim was to obtain an overview of relevant performance indicators and prepare the preliminary categorization detailed above. Therefore, we performed an extensive literature review and delimited the material according to the topic of the present article. In the second phase, our aim was to screen the sub-categories of the first phase and decide upon an acceptable dashboard. We used the Delphi technique to seek convergence on opinions from domain experts. In the third phase, we performed a literature review on the performance of social enterprises. We aimed at comparing the performance dashboard with research efforts for social enterprises and informing it with potentially overlooked or complementary indicators. Table 2 gives an overview of the three phases of the research process.

Phase 1
In phase 1, we followed review procedures drawn from scholarly work on performance and sustainability measurement research [13,16,17]. We only considered contemporary research, demarcated as scholarly and practitioner efforts involving performance measurement frameworks or metrics since 1980. To derive an initial population of articles, we conducted electronic keyword searches in major bibliographic databases, such as "AgEcon", "JSTOR", "Web of Science", "ScienceDirect", "WorldCat", "EBSCOhost", "Scopus", and "Academic Search Premier". Three of the authors and three experts on the topic (i.e., in terms of numbers of studies conducted, papers published and reviewed, and familiarity with specific journals covering cooperative research) developed the keyword search strings, namely "performance measurement", "performance appraisal", "performance evaluation", "performance assessment", "efficiency", "cooperatives", and "credit unions". To expedite the identification of relevant journal papers, we restricted our focus to the articles that included one or more of the search terms in the title, abstract or keywords, along with the term "cooperatives" or "credit unions". We also consulted "Google Scholar" and, thus, conference proceedings, industry briefs, and policy reports were reviewed too, provided that the publication was in English and under the auspices of a well-established organization (e.g., USDA, Washington, DC, USA) or association (e.g., the Agricultural and Applied Economics Association-AAEA, Milwaukee, WI, USA). Finally, we detected overlooked sources with the aid of the three experts. Our extensive investigation revealed a notable array of research over the last decades. Each document was then examined to retain only those that contained an explicit performance framework or metric(s) for cooperative organizations. All documents were double-coded by two of the authors as well as another coder with experience in cooperative and organizational research.
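The inclusion rule described above can be sketched as a simple record filter. This is a minimal illustration under stated assumptions, not the procedure actually run against the databases: the record layout (`title`, `abstract`, `keywords`, `year`), the function name `matches_search_rule`, and the plain lowercase substring matching are all hypothetical choices made for the example.

```python
# Hypothetical sketch of the phase 1 screening rule: at least one performance
# term AND at least one organizational term in title/abstract/keywords,
# restricted to work published since 1980.
PERFORMANCE_TERMS = [
    "performance measurement",
    "performance appraisal",
    "performance evaluation",
    "performance assessment",
    "efficiency",
]
ORGANIZATION_TERMS = ["cooperatives", "credit unions"]
EARLIEST_YEAR = 1980


def matches_search_rule(record: dict) -> bool:
    """Return True if a bibliographic record satisfies the keyword rule.

    `record` is an assumed layout with string fields 'title', 'abstract',
    'keywords' and an integer 'year'; missing text fields count as empty.
    """
    searchable = " ".join(
        record.get(field, "") for field in ("title", "abstract", "keywords")
    ).lower()
    has_performance_term = any(term in searchable for term in PERFORMANCE_TERMS)
    has_organization_term = any(term in searchable for term in ORGANIZATION_TERMS)
    recent_enough = record.get("year", 0) >= EARLIEST_YEAR
    return has_performance_term and has_organization_term and recent_enough


# Invented example records, for illustration only.
records = [
    {"title": "Efficiency of dairy cooperatives", "abstract": "", "keywords": "", "year": 1995},
    {"title": "Efficiency of dairy firms", "abstract": "", "keywords": "", "year": 1995},
    {"title": "Performance appraisal in credit unions", "abstract": "", "keywords": "", "year": 1975},
]
selected = [r for r in records if matches_search_rule(r)]
```

Plain substring matching is deliberately naive: it would miss the singular "cooperative", so a real search would rely on each database's own wildcard or stemming syntax.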

Phase 2
In phase 2, we employed the Delphi method. This is a popular technique used for the solicitation and aggregation of informed judgments from experts within specific topic areas, developed by the Research ANd Development (RAND) Corporation in the 1950s and 60s [149][150][151]. In effect, it is a systematic process that seeks to achieve convergence on real-world opinions from a group of experts on certain (research) question(s) [152,153]. Opinions are gathered through multiple survey rounds, allowing and encouraging the selected experts to reassess judgments provided in previous iterations [154]. So, in each round, the participants are asked to answer questions individually and anonymously, while, after each round, responses are statistically summarized and reported back to them, giving them the chance to revise their answers [149,152]. As a result, every iteration forms the foundation for the next, and the process, which is guided by a skilled moderator, continues until a consensus or a set level of stability in answers is reached [153]. As the anonymity of contributors is maintained, and their feedback is monitored throughout the process, the Delphi method prevents groupthink, minimizes the influence of dominant individuals, and reduces (statistical) noise [149,150]. Not surprisingly, since its inception by Dalkey and Helmer [152], it has enjoyed a long tradition as a research and management decision tool [151], even though it has hardly been used in cooperative studies (see [154] for an application).
As the Delphi technique does not make use of a random sample of the target population [152,153], we applied a purposive sampling method, identifying potential participants through publications, personal contacts, peer recommendations, research conference lists (e.g., ICA global conferences), and affiliations with organizations active in the field of cooperatives (e.g., research institutes, non-governmental organizations, consultancy firms). To reflect the variety of geographic contexts in cooperative performance research (see Section 4.1 below) and to ascertain that responses represented various possible standpoints (e.g., academic, practical, policy), in line with the past application of the Delphi method in cooperatives [154], we collected expert judgments from a diverse panel. So, to assemble the panel and ensure diversity, the final list of experts was stratified according to sectors (e.g., public, private, and not-for-profit), geographic regions, gender, and field of cooperative expertise. An e-mail invitation was sent to 42 experts, along with a cover letter containing a short description of the Delphi process, a proposed timeline, and a brief outline of the research objectives. After a reminder e-mail, 17 experts agreed to join the panel. The final pool of panelists included 11 males and 6 females. Although the largest group (N = 8) came from North America, the panelists were somewhat geographically dispersed: four were Europeans, three were from Latin America, and two from Africa. Seven panelists were academics (e.g., university faculty members), three were senior managers at consulting firms (e.g., agribusiness consultants), three were officials at governmental organizations (e.g., USDA), two were senior managers of not-for-profit organizations (e.g., development organizations), and two were executives of financial institutions (e.g., a credit union). The majority (N = 10) of panelists held a doctoral degree, and all of them had experience in the topic of cooperative performance on top of a proven track record of cooperative expertise (e.g., significant research output, extensive advisory work).
The actual Delphi study was implemented online, in three rounds. In all iterations, communication was standardized, safeguarding that all panel members received identical information. To reduce over-confidence bias, we also asked experts to report their degree of familiarity with the overarching topic. In round 1, we administered an online survey asking the experts to screen and validate the performance sub-categories confirmed in phase 1 as well as select which ones they would use for measuring cooperative performance along three criteria (i.e., ease of data collection, usefulness, and applicability across contexts). In addition, the most common indicators for each sub-category identified in phase 1 were given as examples, while participants could also suggest new metrics or even new sub-categories. In this round, we used the "average percent of majority opinions" (APMO) cut-off rate as a consensus measure [150]. Based on the latter, responses were summarized and sent back to participants for review in round 2. Through discussion and revision, a consensus was reached by narrowing the survey to three sub-categories and eight indicators that served as the content for the round 3 survey tool. In round 3, three participants decided to drop out, and the remaining 14 were asked to determine the suitability of the eight indicators on a 5-point Likert scale. Levels of agreement among participants were determined using simple measures of central tendency as a consensus criterion [153]. In this round, a general consensus was reached and, thus, we decided to stop further deliberations.
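The two consensus measures mentioned above can be illustrated with a short sketch. The text does not spell out how the APMO cut-off was computed, so the code assumes one common formulation from the Delphi literature: majority agreements plus majority disagreements, divided by the total number of opinions expressed, where a majority is more than 50% of the opinions on a statement. The function names, the input layout, and all vote counts and ratings are hypothetical.

```python
from statistics import mean, median


def apmo_cutoff(votes):
    """APMO cut-off (in percent) under the assumed formulation.

    `votes` is a list of (agree, disagree) counts, one pair per statement;
    experts who gave no opinion on a statement are simply not counted.
    """
    majority_agreements = 0
    majority_disagreements = 0
    total_opinions = 0
    for agree, disagree in votes:
        opinions = agree + disagree
        total_opinions += opinions
        if agree > opinions / 2:        # majority agrees with the statement
            majority_agreements += agree
        elif disagree > opinions / 2:   # majority disagrees with the statement
            majority_disagreements += disagree
    return 100.0 * (majority_agreements + majority_disagreements) / total_opinions


def reaches_consensus(votes, cutoff):
    """Flag statements whose larger side meets or exceeds the cut-off."""
    return [
        100.0 * max(agree, disagree) / (agree + disagree) >= cutoff
        for agree, disagree in votes
    ]


def likert_summary(ratings):
    """Simple central-tendency summary (mean, median) for 5-point ratings."""
    return mean(ratings), median(ratings)


# Invented round 1 votes from a 17-member panel on three statements.
round1_votes = [(12, 5), (4, 13), (9, 8)]
cutoff = apmo_cutoff(round1_votes)
consensus_flags = reaches_consensus(round1_votes, cutoff)

# Invented round 3 ratings of one indicator by five panelists.
indicator_mean, indicator_median = likert_summary([4, 5, 4, 3, 5])
```

In this invented example the cut-off works out to roughly 66.7%, so the first two statements count as settled (by majority agreement and majority disagreement, respectively), while the third would go back to the panel.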

Phase 3
Even though the past decade has witnessed a surge of scholarly interest in social entrepreneurship and social enterprises, it was not until the same decade that such research became an influential literature stream [137,138]. Hence, before conducting the review on the performance of social enterprises, we could expect that perhaps the sheer number of works devoted to the topic at hand would be smaller than that anticipated for cooperatives. Considering that social enterprises were not the focal business form of this article, we restricted ourselves to including peer-reviewed articles (in English) that specifically and explicitly stated social enterprises as their main research topic. So, we consulted the same databases as in phase 1 (with one exception) and searched for articles containing the terms "social enterprise" or "social venture" in the title, abstract, or keywords, along with the terms "performance measurement", "performance appraisal", "performance assessment", "performance evaluation", and "efficiency". All documents were double-coded by two of the authors.

Phase 1
Our review resulted in a sample of 139 empirical works (i.e., 121 journal articles, eight conference proceedings, six book chapters, and four reports) and four guides. The vast majority of the empirical studies examined agricultural sectors (i.e., ≈85%), slightly more than 15% related to retail banking, and less than 5% investigated other sectors (e.g., industrial, consumer). A third of the studies focused on the United States (USA), a bit more than a third (i.e., 37%) considered European countries, and the rest centered on countries from Asia (e.g., India, Japan, China), Africa (e.g., Ethiopia, Kenya), Latin America (e.g., Brazil, Costa Rica), and Australia or Canada. Interestingly, most research drew samples from the dairy sector (29%), followed by the grain sector (25%), farm supply (25%), and fruit and vegetables (21%). Moreover, almost 20% of studies compared cooperatives with IOFs, with the rest focusing solely on cooperatives or cooperative members. In Table A1 in Appendix A, we present all studies across the sample profile (e.g., country, data period, number of cooperatives) and sector(s). Of course, we also present the sub-categories in which each study was classified next to the metrics employed. In addition, at the bottom of Table A1, we present the metrics proposed by the four guides, the sub-categories these metrics belong to, as well as the countries and sectors to which they are applicable or for which they have been designed. Table 3 below provides a summary overview of all the reviewed work (i.e., both the empirical studies and the guides) across the five sub-categories of the preliminary framework.
Tables 3 and A1 reveal that the largest number of empirical studies (i.e., 58%) could be classified as BFA. Unsurprisingly, some studies utilized sales-based metrics (e.g., market shares, sales growth), but the overwhelming majority used financial ratios. The latter could be further divided into two main sets. The first consists of profitability and efficiency ratios illustrating the ability of equity capital to generate returns as well as indicating how effectively assets are utilized [74,86]. The second set, which contains leverage, solvency, and liquidity ratios, concentrates on metrics that show the nature of financing equity capital and the ability of the cooperative to pay its debts in the long run (i.e., solvency, leverage) or to meet its short-term obligations out of liquid assets (i.e., liquidity) [63,155]. Moreover, a few studies (e.g., [67,82,156]) employed export-oriented ratios, such as the export intensity ratio (i.e., export to total sales) or the degree of internationalization ratio (i.e., foreign sales to total sales). Finally, many studies devoted to retail banking (e.g., [157–160]) made use of banking-specific ratios like the loan ratio, often on top of examining the traditional ones.
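To make the ratio families described above concrete, the following sketch computes one representative ratio from each set. The specific ratios chosen (return on equity, return on assets, asset turnover, debt-to-assets, current ratio) are standard textbook definitions picked for illustration and are not necessarily those used in any reviewed study; only the export intensity ratio (export to total sales) follows the definition given in the text. The `FinancialStatement` layout and all figures are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FinancialStatement:
    """Assumed minimal set of statement items (same currency unit throughout)."""
    net_income: float
    total_sales: float
    export_sales: float
    total_assets: float
    equity: float
    total_liabilities: float
    current_assets: float
    current_liabilities: float


def financial_ratios(s: FinancialStatement) -> dict:
    """Compute illustrative ratios from each family named in the text."""
    return {
        # Set 1: profitability and (asset-use) efficiency
        "return_on_equity": s.net_income / s.equity,
        "return_on_assets": s.net_income / s.total_assets,
        "asset_turnover": s.total_sales / s.total_assets,
        # Set 2: leverage/solvency and liquidity
        "debt_to_assets": s.total_liabilities / s.total_assets,
        "current_ratio": s.current_assets / s.current_liabilities,
        # Export-oriented ratio as defined in the text
        "export_intensity": s.export_sales / s.total_sales,
    }


# Invented figures for a hypothetical cooperative.
example = FinancialStatement(
    net_income=50.0,
    total_sales=2000.0,
    export_sales=500.0,
    total_assets=1000.0,
    equity=400.0,
    total_liabilities=600.0,
    current_assets=300.0,
    current_liabilities=200.0,
)
ratios = financial_ratios(example)
```

As the earlier discussion of BFA cautions, such ratios treat the cooperative as a stand-alone firm: a low return on equity may simply reflect above-market prices paid to members rather than weak performance.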
BEA was the other sub-category that recurred frequently in the literature. Notably, almost every third article included efficiency assessment metrics. As expected, most contributions favored technical and allocative efficiency, but other efficiency variants were also used (e.g., cost efficiency, scale efficiency, total factor productivity). Furthermore, as explained in Section 2.2.2, other efficiency-related metrics could be placed in the BEA classification, such as the marketing margin per unit of capacity [161] or the comparative cost index [100].
In contrast to the BFA and BEA sub-categories, the remaining three have received comparatively little, and mostly recent, attention. Except for an early application from Babb and Boynton [87], it was not until the last decade that SBA, OMA, and SMA metrics were first employed (e.g., [29,107]). In fact, their use only proliferated in the past five years or so, even though some metrics (e.g., satisfaction, perceived performance by key informants) were drawn from mainstream management or marketing studies, a domain with a decades-long tradition of such use [38]. In total, the three sub-categories accounted for not more than one-fourth of all reviewed studies. In the SBA sub-category, the most commonly adopted metric related to key informants' (e.g., CEO, board chair) perceptions about overall performance or performance aspects (e.g., how satisfied members are). In the OMA sub-category, the whole range of observable membership characteristics identified in the preliminary framework could be spotted, from user-benefit arrangements (e.g., prices paid, quality of services) or user-control features (e.g., governance procedures) to patronage-related data (e.g., farm profitability). Yet, side-selling appeared to be the most commonly reported measure. The SMA sub-category was dominated by metrics related to overall member satisfaction or satisfaction with membership aspects (e.g., technical assistance, pricing policies, information flow), followed by loyalty measures (e.g., intention to continue membership).
Finally, a handful of papers (e.g., [7,146,147]) also included metrics not directly belonging to any of the five sub-categories but rather concerning environmental performance or the impact on internal (e.g., employees) and external stakeholders (e.g., the community), such as the employment size and the community payments ratio (i.e., community expenditure to total assets). In contrast, the four performance guides (i.e., [19,162–164]) propose a considerable number of metrics relating to social or environmental value, such as indicators for community involvement and development (e.g., amounts granted for donations, scholarships, and sponsorships), employee benefits (e.g., salaries, training, hiring practices), and environmental impact measures (e.g., emission and waste reduction). Similarly, all of the guides elaborate on the OMA sub-category, highlighting the social-membership perspective and the importance of capturing member benefits.

Phase 2
In round 1, respondents were given three weeks to complete the online survey. As pointed out in Section 3.2, experts were first asked to assess their familiarity with cooperative metrics on a 7-point Likert scale, partly as a means of curbing over-confidence bias. It turned out that the panelists rated themselves high on average (M = 5.71, S.D. = 1.16), albeit at a reasonable rate. They were then asked to answer how "easy it is to collect data for the <<sub-category>>", how "useful is the <<sub-category>>" and how "applicable is the <<sub-category>> across contexts".

Electronic copy available at: https://ssrn.com/abstract=3498873
Respondents could answer whether they agreed or disagreed, generating a potential maximum set of 255 responses. To determine the level of consensus for these responses, we applied the APMO method (see [150] for an overview). This is expressed as:

APMO = [(majority agreements + majority disagreements) / total opinions expressed] × 100%

According to this method, a statement must achieve a percentage for "agreement" or "disagreement" that is higher than the APMO cut-off rate. The latter is calculated as follows: first, the number of majority agreements and disagreements is computed by expressing the participants' answers in percentages per statement. A majority is defined as a percentage above 50%. Second, the majority "agreements" and "disagreements" are summed up. Third, these sums are divided by the total number of opinions expressed to calculate the APMO cut-off rate. Any item below the cut-off rate may enter round 2 for re-evaluation.
To calculate the APMO rate for the first round, we used the 15 statements generated by the three questions presented above (five sub-categories multiplied by three questions). So, 113 majority agreements plus 50 majority disagreements (only those >50% are summed) were divided by the total of 252 opinions. This resulted in an APMO rate of 64.68%. As we can see in Table 4, nine statements during the first round reached a percentage of (dis)agreement that was higher than 64.68%, and thus reached a consensus. More specifically, a consensus was fully reached for the SMA sub-category. A consensus was also partly reached for the BFA and OMA sub-categories, in two out of three criteria. That is, the panelists could not clearly agree or disagree on whether it is easy to collect data for BFA and OMA. In contrast, they did agree that data collection is not easy for BEA. They could not reach a consensus for BEA along the other two criteria, however. Likewise, no consensus was reached for SBA along any of the three criteria.

Note: The suffix "_e" stands for "ease of data collection" (question 1), the suffix "_u" stands for "usefulness" (question 2), and the suffix "_a" stands for "applicability across contexts" (question 3).
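The cut-off calculation described above can be sketched in a few lines of Python. This is only an illustrative reading of the three steps, not the authors' own procedure: the function names and the per-statement tallies are hypothetical, and only the totals 113, 50, and 252 come from the text.

```python
from typing import Iterable, Tuple


def apmo_cutoff_from_totals(majority_agreements: int,
                            majority_disagreements: int,
                            total_opinions: int) -> float:
    """APMO cut-off (%) from already-summed majority opinions."""
    if total_opinions <= 0:
        raise ValueError("total_opinions must be positive")
    return 100 * (majority_agreements + majority_disagreements) / total_opinions


def apmo_cutoff_from_tallies(tallies: Iterable[Tuple[int, int]]) -> float:
    """APMO cut-off (%) from per-statement (n_agree, n_disagree) tallies.

    Assumption: for each statement, only the side exceeding 50% of the
    opinions expressed contributes its votes to the numerator.
    """
    majority_agree = majority_disagree = total = 0
    for n_agree, n_disagree in tallies:
        n = n_agree + n_disagree
        total += n
        if n == 0:
            continue
        if n_agree / n > 0.5:
            majority_agree += n_agree
        elif n_disagree / n > 0.5:
            majority_disagree += n_disagree
    return apmo_cutoff_from_totals(majority_agree, majority_disagree, total)


def reaches_consensus(pct_agree: float, pct_disagree: float, cutoff: float) -> bool:
    """A statement reaches consensus if either side exceeds the cut-off."""
    return max(pct_agree, pct_disagree) > cutoff


# Totals reported for round 1 in the text.
cutoff = apmo_cutoff_from_totals(113, 50, 252)
print(f"{cutoff:.2f}%")  # 64.68%

# Hypothetical tallies for three statements (not data from the study).
example = [(15, 2), (4, 13), (8, 8)]
print(apmo_cutoff_from_tallies(example))  # 56.0
```

Note that a tied statement, such as the third hypothetical tally, adds to the total opinions but not to the numerator, which lowers the cut-off.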
In round 2, the panelists reached an agreement regarding the contested cases of the first round. That is, after being sent the summarized responses and through discussion, they decided that the SBA and BEA sub-categories should be eliminated (see Table 5). They did retain the BFA and OMA ones, acknowledging that data collection is not easy but definitely easier than for the eliminated sub-categories. Furthermore, in this round, the panelists agreed to carry on with the most common indicators identified for BFA, OMA, and SMA (see below). Finally, no new sub-category was put forward in any of the first two rounds, while the few additional metrics suggested by experts had already been identified in phase 1.

In round 3, three experts decided not to continue. The rest were asked to rate the eight metrics approved in the previous round. To determine the consensus level, we used the mean as an orientation criterion and the standard deviation (SD) as a level criterion. SD values below 1 were deemed to indicate a "high" level of consensus [153]. As we can see in Table 6, all but two metrics reached a high level of consensus. In fact, the two metrics that failed to do so also had the lowest means. Of course, one of the BFA metrics (i.e., profitability ratios) only marginally fulfilled the consensus level criterion. All in all, shortly after gathering and analyzing round 3 responses, we reckoned that phase 2 objectives were met and, thus, decided not to proceed to a fourth round.
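The round 3 criterion (mean for orientation, SD below 1 for a "high" level of consensus) can be illustrated with a minimal Python sketch. The ratings, the 7-point scale, and the metric names are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which variant was computed.

```python
from statistics import mean, stdev
from typing import Dict, List, Tuple


def consensus_summary(ratings: List[int],
                      sd_threshold: float = 1.0) -> Tuple[float, float, bool]:
    """Return (mean, sample SD, high_consensus) for one metric's ratings.

    Assumption: "high" consensus means a sample SD strictly below the threshold.
    """
    m = mean(ratings)
    sd = stdev(ratings)  # sample SD; population SD (pstdev) would be slightly smaller
    return m, sd, sd < sd_threshold


# Hypothetical ratings from 14 panelists on an assumed 7-point scale.
hypothetical_ratings: Dict[str, List[int]] = {
    "metric_A": [6, 6, 7, 6, 5, 6, 7, 6, 6, 5, 6, 7, 6, 6],
    "metric_B": [2, 7, 4, 6, 3, 7, 5, 1, 6, 4, 7, 2, 5, 3],
}

for name, ratings in hypothetical_ratings.items():
    m, sd, high = consensus_summary(ratings)
    print(f"{name}: M = {m:.2f}, SD = {sd:.2f}, high consensus = {high}")
```

In this made-up example, the tightly clustered ratings for metric_A pass the SD criterion, while the widely spread ratings for metric_B do not.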

Phase 3
As expected, our review of the literature on the performance of social enterprises confirmed that approaches to measuring performance within social enterprises remain in the early stages [136]. Not surprisingly, the number of articles measuring or merely conceptualizing performance in social enterprises was small compared with the volume generated in our review of the empirical work on cooperatives (see Table A2 in Appendix B). Moreover, we found no study focused on the agricultural sector. Of course, as social enterprises use a business logic to improve the situation of population segments that are disadvantaged or even excluded [138], it should not be surprising that almost all reviewed studies were devoted to socially-oriented sectors, such as those of work integration and social care. Interestingly, quite a few studies (e.g., [133,135,141,165,166]) included cooperatives in their samples and treated them as social enterprises. Perhaps, as numerous social cooperatives providing socially-oriented services (e.g., work integration, healthcare) can be found in many countries [147], such identification with social enterprises can be anticipated, although it should be avoided.
As far as metrics are concerned, early work concentrated on adaptations of Kaplan and Norton's [167] balanced scorecard, deploying strategic objectives into operational ones in order to determine how social value is created [168]. A handful of studies relied on financial data, in line with BFA metrics, while others used or developed subjective measures (e.g., key informant's view on economic and social performance), which in turn could be directly compared to SBA metrics.
Not unexpectedly, all studies used some indicators designed to capture social value (e.g., social performance), even though almost all of the studies recognized the challenge of assessing it as opposed to financial performance. Still, two models that concentrate on social value but also blend it with economic inputs and outputs clearly prevailed.
The first one is the social return on investment (SROI) and is part of the synthetic type of metrics, which aim to provide a global performance assessment of a social organization [148]. The SROI model was developed by the Roberts Enterprise Development Fund and is based upon the principles of cost-benefit analysis [141]. By analogy with its business counterpart (i.e., the return on investment), it measures the value of social benefits created by an organization in relation to the cost of achieving those benefits [148]. In other words, it is a measure that monetizes outcomes, comparing the (monetized) social costs of a program with the (monetized) social benefits of achieving an outcome [169]. As a synthetic indicator, the SROI model seeks to merge financial and social value with a view to formulating a single parameter representing the social enterprise's performance [145]. Similarly to the second dominant model (i.e., the "logic model") below, it puts those affected (i.e., the beneficiaries) at the heart of the measurement process [170].
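As a rough numerical illustration of the ratio described above, the following Python sketch divides monetized social benefits by the monetized cost of achieving them. All figures and item names are hypothetical, and the sketch deliberately ignores refinements such as discounting future benefits, which fuller SROI analyses typically apply.

```python
from typing import Dict


def sroi_ratio(monetized_benefits: Dict[str, float],
               monetized_costs: Dict[str, float]) -> float:
    """Simplified SROI: total monetized benefits per unit of monetized cost."""
    total_cost = sum(monetized_costs.values())
    if total_cost <= 0:
        raise ValueError("total monetized cost must be positive")
    return sum(monetized_benefits.values()) / total_cost


# Hypothetical program figures (currency units), not taken from any study.
benefits = {"increased incomes": 120_000, "reduced welfare payments": 60_000}
costs = {"staff": 50_000, "equipment": 10_000}

print(sroi_ratio(benefits, costs))  # 3.0
```

A ratio of 3.0 would be read as three currency units of monetized social benefit per unit of cost. The hard part in practice is the monetization step itself, which this sketch takes as given.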
The second model is based on the so-called "logic model" of assessment (or impact value chain model), a process-based model centering on the process of 'production' of a social service/product [168]. The "logic model" was originally developed for USAID in the late 1960s and has its roots in the evaluation of programs and projects [171]. It articulates indicators and metrics into inputs, outputs, outcomes, and impacts [145]. Organizational inputs (e.g., equipment, funds) are used to support activities or processes for the production of goods and services that in turn result in the delivery of outputs to a target beneficiary population (e.g., number of people benefitting) [142]. These short-term outputs are expected to lead to improved outcomes in the lives of beneficiaries typically measured in terms of medium-and long-term benefits (e.g., increased incomes, social integration) [171]. The component of impact usually refers to the consequences for the wider community, acknowledging the secondary effects that may accompany the outcomes (e.g., community benefit due to social integration) [133]. In short, the "logic model" and its variants used by the studies at hand are centered on the beneficiaries, but implications for the wider community are often integrated, even though the causal link between outcomes and impact might not be apparent or go beyond the control of the social enterprise in question [135].
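The four-stage structure of the "logic model" can be pictured as a simple data structure. The Python sketch below is only a hypothetical way of organizing indicators by stage. The class and method names are assumptions, and the example entries are the ones mentioned in the text.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LogicModel:
    """Indicators grouped by stage of the impact value chain."""
    inputs: List[str] = field(default_factory=list)    # resources used
    outputs: List[str] = field(default_factory=list)   # short-term deliverables
    outcomes: List[str] = field(default_factory=list)  # medium- and long-term benefits
    impacts: List[str] = field(default_factory=list)   # wider-community consequences

    def stages(self) -> List[Tuple[str, List[str]]]:
        """Return the stages in causal order, from inputs to impacts."""
        return [
            ("inputs", self.inputs),
            ("outputs", self.outputs),
            ("outcomes", self.outcomes),
            ("impacts", self.impacts),
        ]


model = LogicModel(
    inputs=["equipment", "funds"],
    outputs=["number of people benefitting"],
    outcomes=["increased incomes", "social integration"],
    impacts=["community benefit due to social integration"],
)

for stage, indicators in model.stages():
    print(f"{stage}: {', '.join(indicators)}")
```

Keeping outcomes and impacts as separate fields mirrors the point made above that the link between the two may be hard to establish or lie beyond the organization's control.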

The 'Currency Matrix'
In harnessing the "currency matrix" for the performance measurement of cooperatives, we "amalgamate" the findings from the three phases in a concrete dashboard, even though we do not narrow down the scope to the exact metrics singled out in the Delphi study. In phase 1, it became clear that, despite the dominance of the business sub-categories (i.e., BFA and BEA), the social-membership perspective, represented by OMA and SMA, has entered the lexicon of empirical research in cooperative performance and is gaining increasing attention. Yet, no performance assessment endeavor can afford to disregard the business perspective, particularly the BFA metrics that apply to cooperative and non-cooperative contexts alike. Moreover, phase 1 findings suggested that hardly any efforts are made to empirically assess cooperative impact beyond cooperative boundaries (e.g., benefits to the community). In phase 2, cooperative experts helped to "hammer" the assessment components and imprint them into a three sub-category dashboard. As we can see in Figure 1, the BFA element reflects the business aspects, and the SMA constituent conveys the social-membership viewpoint. Together, they do justice to the dual objective of the unique cooperative organizational form. However, the OMA addition solidifies both components, exemplifying in observable terms what members receive but also what they partly contribute to keeping their cooperative enterprise in business.
Consequently, even though integrating measures from BFA and SMA would probably suffice to obtain a firm view on cooperative performance, complementing them with OMA metrics helps paint a complete picture. Additionally, users may employ the metrics that comprise each constituent (M1, M2 … Mν in Figure 1) depending on their context characteristics. Interestingly, in phase 3, it became evident that the social aspect takes center stage in the scholarly work on the performance of social enterprises. Emphasis is placed on the beneficiaries, but societal implications beyond the recipients' frontiers are accounted for or at least considered. In phase 1, only the performance guides concentrate on social aspects. Hence, phase 3 findings and the limited attention in phase 1 results suggest that the ground for the social perspective (in membership terms and beyond) is undoubtedly fertile for a genuinely socially-embedded business form like cooperatives, particularly when attempting to unveil their actual socio-economic impact.
Finally, the three sub-categories are glued to each other. Even though they are based on distinct metrics and are ostensibly independent, they are essentially interdependent. Yet, they should not be treated as an all-inclusive index, and they probably cannot result in a single supreme indicator. Rather, together they epitomize a "form for a medium of knowledge exchange" (the "currency matrix"). This medium enables "users" (researchers or practitioners) to pick the "exact units" (metrics) that generate "global values" (scores) that ultimately empower them to "trade" (exchange) their findings in the knowledge "marketplace". If the "currency matrix" is duly utilized, findings on cooperative performance may become easily "interchangeable" rather than risk ending up isolated. Moreover, as the three sub-categories are fundamentally symbiotic with the social impact aspect, adding social value measurement elements opens up the exchange of ideas or results past the cooperative "universe". As a result, we anticipate that studies employing metrics from all three components as well as assessing social impact will be in a better position to capture cooperative performance comprehensively and at the same time produce a fruitful dialogue.

Discussion
In this paper, we aimed at delivering a performance dashboard for cooperatives that could be comprehensive and simultaneously consistent with the dual nature of the distinctive cooperative organizational form. In so doing, we began with an analysis of a preliminary framework, in which we detailed five sub-categories and documented their advantages and shortcomings. Then, in phase 1 we reviewed an impressive body of empirical work and validated the preliminary framework. In phase 2, we integrated the input from experts in the field, and through multiple iterations transformed the framework into a concrete three-sub-category dashboard. In phase 3, we explored comparable work for a business form (i.e., social enterprises) that also straddles business and social components and faces similar business-social challenges. This inquiry encouraged us to fortify the social perspective of the dashboard. Moreover, based on what has been most commonly used in the literature as well as on what the experts singled out, we proffered a manageable bundle of metrics for each of the three sub-categories, even though we aimed neither to prepare a global performance measure nor to direct future work toward particular metrics. Instead, our dashboard covers the assessment constituents that can be considered representative of the cooperative organizational form and fundamental for measurement endeavors. Hence, it may serve as a common benchmark (a "currency matrix") for future empirical studies or at least trigger more inquiries that look into both the business and social perspectives.
Our finding that studies have only recently paid attention to the social perspective, coupled with the absence of impact assessment beyond the cooperative boundaries, in sharp contrast to research on social enterprises, warrants further investigation. It is already surprising that cooperatives have been unable to disseminate their competence in creating both commercial and social value, particularly in light of the estimation of the International Labour Organization (ILO) that the livelihoods of nearly half the world's population are secured by cooperatives [6] or despite the annual reporting by the World Cooperative Monitor [20]. Therefore, we suggest that future research accommodate the assessment of far-reaching social impact too. Perhaps, when scholars and practitioners consider what to assess or what to report, they should embrace the quote from Pericles: "What you leave behind is not what is engraved in stone monuments, but what is woven into the lives of others". In other words, cooperatives will be in a better position to demonstrate they are an effective tool for sustainable social development if cooperative scholars and managers engage in systematic evaluation of social value too [40].
A central strength, but also a limitation, of this study is the focus on the agricultural domain. At the outset of the paper, we explained that we chose to concentrate on this domain, given the robust market presence agricultural cooperatives exhibit worldwide, the policy support they enjoy in several countries, and the marked attention they have attracted in the specialized academic literature. In reality, we did consider all sectors and reviewed related work, but, not unexpectedly, we found that almost 85% of the 139 empirical studies at hand were entirely or partly devoted to agricultural cooperatives. We acknowledge, however, that future studies may not be in a position to pick certain metrics out of those proffered (e.g., side-selling). A solution for researchers would be to favor the sub-categories of the proposed dashboard, while selecting or adapting those metrics that suit their contexts. For example, in phase 1 we showed that some studies which examined retail banking cooperatives employed banking-specific financial ratios. So, we could suggest that, regardless of the subtype (e.g., consumer, purchasing, financial, housing), researchers could utilize the "matrix" to assess performance, as long as they make the right metric selections and the right adaptations. We expect that the OMA sub-category would probably call for particular attention (e.g., the metric "prices paid" would need careful interpretation), whereas the BFA and SMA sub-categories would require less effort. For example, measuring "member satisfaction" across subtypes or calculating financial ratios would be a relatively uncomplicated undertaking.
Similarly, as Franken and Cook [27] have pointed out, the correspondence between different metrics might be contingent on the type of the cooperative (e.g., multipurpose vs. supply), which in turn might be bound to the sector(s) (e.g., dairy vs. grain) that the sample in question is associated with. More research is definitely needed to explore a better alignment between the different contexts and the various metrics, also in line with the calls from mainstream management research [16,38]. Moreover, following sustainability studies' convention to treat stakeholders as an integral part of the measurement process [13], future research could more systematically involve internal and external stakeholders in the cooperative performance assessment process and, thereby, develop a taxonomy of (apt) metrics by stakeholder type. Of course, as the core stakeholders (i.e., the members) routinely exhibit substantial heterogeneity in their preferences [30], it is rather challenging to satisfy their interests, let alone to balance the diverse concerns of the varied stakeholders. Nonetheless, accounting for the inherent heterogeneity in stakeholder preferences when measuring cooperative performance will permit a richer understanding of cooperatives' socio-economic impact on top of expediting a dynamic configuration between research contexts and metrics.
Furthermore, it could be promising to examine our suggested dashboard and different metrics through the prism of the cooperative life-cycle framework [172,173]. The latter encapsulates the business and social perspectives, among others, and assesses cooperative "health" over five sequenced phases through a bundle of metrics (e.g., prices paid, services, feeling of community) that tie in closely with our dashboard. Perhaps deploying the dashboard constituents and associated metrics along the five phases would help researchers to interpret performance outcomes more accurately and to understand the interconnections between the constituents in each phase. In practice, coalescing our dashboard with the life-cycle framework could probably assist cooperative leaders in making informed decisions, particularly in the final phase, where they have to make a "choice" that determines whether their cooperative can go through succeeding life cycles.
In conclusion, while we believe we have succeeded in providing academics and practitioners with a "currency matrix" of cooperative performance measurement to rely on, we see an opportunity for scholars to advance the performance debate and possibly provide a concluding touch, as long as they do not disregard the (dual) nature and the (social) roots of the idiosyncratic cooperative organizational form. We hope we have made a small step toward convergence in understanding cooperative performance assessment and in facilitating future scientific comparisons. Cooperatives are well-placed to contribute to sustainable development, although, to render their contribution visible universally, they first need to be well-equipped to quantify their impact consistently.
Author Contributions: All of the authors contributed substantially to the conception of the paper, and jointly defined the methodology and delimitations of the study. T.B. also contributed by preparing the background research, performing the reviews and the data analysis, as well as drafting the manuscript. N.K. also contributed by acquiring and preparing the data, providing analysis and interpretation, and conducting critical revision. M.W., K.d.R., and J.M.E.P. also contributed by editing and critically reviewing the work. Finally, all of the authors contributed substantially to the conclusions of the paper.

Conflicts of Interest:
The authors declare no conflict of interest. The funding sponsor had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, and in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript:

Work integration and community services (e.g., social tourism, bulk waste, bike rental)
a. Financial statement analysis
b. Social effectiveness: a variant of the "logic model" of assessment/impact value chain model (i.e., sustainability of inputs, outputs-activities, outcomes to intended beneficiaries, social and economic impacts on the wider community)
c. Institutional legitimacy (institutional coherence, compliance with laws and secondary norms)

Millar and Hall (2013) [148]
Health and social care
a. SROI
b. Internal tools (not specified)

Arena et al. (2015) [141]
Energy production and distribution
A variant of the "logic model" of assessment/impact value chain based on inputs, outputs, and outcomes, and exemplifying three dimensions: efficiency (output/input), effectiveness (output characteristics), and impact (long-term effects of the output on the target community)

Battilana et al.