Responsibility beyond the Board Room? A Systematic Review of Responsible Leadership: Operationalizations, Antecedents and Outcomes

Abstract: For more than two decades, researchers have aimed to measure responsible leadership. This has resulted in several survey instruments and parallel streams of research, making it difficult to carve out the core. We systematically review 28 studies measuring responsible leadership (RL). A qualitative content analysis of RL survey instruments is conducted to identify the core aspects across measures, and the evidence is synthesized to map the antecedents and outcomes of RL. Findings show that 24 studies in the sample were published during the last two years, indicating a growth spurt in the field. Most survey instruments on RL measure leadership of the individual direct leader, while a few have a wider focus, such as leadership of the organization. Four themes were identified across RL survey instruments: accountable role model, inclusive facilitator, inventive planner and benevolent value creator. Our review contributes to the establishment of a joint platform for future research. In addition to providing a systematic account of evidence, our analysis points at research gaps and gives a basis for a critical discussion on nature as a stakeholder. Avenues for future research are outlined.


Introduction
Corporate scandals in global business, such as the Enron collapse, Volkswagen's Dieselgate scandal and the fracking accidents of Equinor, have contributed to a wide-ranging discussion on businesses' roles and responsibilities in society over the past few decades.
The quest for responsible leadership (RL) is not only an answer to business scandals, but also a response to changes in global society, climate change and new responsibility demands (e.g., UN sustainability goals). With the growing environmental challenges around the world, there is pressure from various stakeholders such as governments, consumers and NGOs for corporations to act in the role of global citizen (e.g., [1,2]).
Following the increased stakeholder focus, many companies have taken on some sort of corporate social responsibility (CSR) initiative (e.g., [3][4][5]) in their efforts to contribute to the triple bottom line of social, environmental and economic sustainability [6]. Yet, a lack of trust in leaders across sectors has been detected over the last several years, and recent reports have suggested a growing belief that capitalism, in its current form, is doing more harm than good in the world [7,8]. It has been argued that the economic crises taking place at the start of this century were in fact crises of leadership ethics [9]. Irresponsible lending practices of US financial institutions, together with numerous instances of corruption, led to the near breakdown of the global financial system in 2008. Consequently, the ethics and responsibility of leaders and the ability to identify and rebuild the missing trust have been attracting much attention both in business and in academia [10].
Sustainability 2021, 13, 10298
Corporate leaders are arguably important social agents for the performance of socially responsible businesses [11], and they play a key role in planning and executing CSR initiatives [12]. While some leaders may consider the CSR initiatives as a necessary financial trade-off to maintain good stakeholder relations and the company's reputation, other leaders may assume that the competitiveness of their company and the health of the community around it are mutually dependent. This latter assumption is related to concepts such as blended value creation [13] and the creation of shared value (CSV) [14], where leaders develop deep links between their business strategies and their CSR initiatives, therefore combining ethical and strategic concerns.
Early research on responsible leadership has described and explored both irresponsible leadership cases such as the Enron collapse [15] and responsible leadership cases such as the leadership of the Body Shop [16], and has included conceptual studies of this new construct (e.g., [17,18]). For two decades, researchers have strived to operationalize RL, resulting in several measures based on somewhat different conceptualizations of the phenomenon. These measures have been applied in early explorations of the antecedents and outcomes of RL. However, from a research standpoint, there is little accumulated knowledge, as there seem to be several parallel streams of research using different conceptualizations and different operationalizations, making it difficult to compare findings across studies and contexts. In this article, we present a systematic review of developed and applied RL survey instruments. The study aims to map out the content of the RL survey instruments, synthesize current evidence and propose a future research agenda.

Background Literature
Scholars have introduced several conceptualizations and operationalizations of RL [19][20][21][22]. The following theoretical overview is not exhaustive, but it illustrates the divergent and somewhat overlapping conceptualizations of RL in the literature.
Across several conceptual studies, RL includes a strong emphasis on accountability towards stakeholders both inside and outside the organization. This sets RL apart from leadership focusing on dyadic supervisor-subordinate relationships [18] and acknowledges a wide array of constituencies, such as business partners, customers, supply chain associates and government with legitimate claims on organizational activities [23,24].
Both stakeholder theory [25] and relational perspectives of leadership theory [26,27] have been coupled with RL. The theoretical link between relational leadership and RL lies in the assumption of leadership not being restricted to hierarchical positions or roles but rather occurring in relational dynamics throughout the organization [27]. According to stakeholder theory, it is in the firm's interest to create value for a wide group of stakeholders and not just for the limited stakeholder group of shareholders [25]. Thus, a combination of the two implies relational dynamics between leaders and stakeholders both inside and outside of organizations. Along these lines, RL has been proposed as an umbrella concept to rethink the general notion of leadership in the context of stakeholder theory [28]. While quite a few conceptualizations attribute the embodiment of equal power relations with stakeholders inside and outside the organization to the individual leader (e.g., [16,29,30]), an inclusion of systems such as the contextual environment, the internal environment and the process system into the RL construct has also been proposed [31].
Regardless of this delineation, the act of RL is proposed to directly affect different levels of stakeholder groups at the micro, meso and macro levels [32]. However, the stakeholder construct has many definitions, and what makes up a stakeholder is not comprehensively discussed in the RL literature (e.g., nature as a stakeholder), leaving the range to which the leader is responsible unestablished.
While some theorists have suggested that RL originates from two types of responsible behaviour: "do good" and "avoid harm" [24], others have advocated that differences in mindsets can explain towards whom the leader perceives responsibility. Different leader mindsets are proposed: a limited economic view, and an extended stakeholder view [9,33]. The limited economic mindset is harmonious with shareholder theory [34], upholding that the leader's sole responsibility is towards shareholders and concurring that businesses have no obligations to society other than operating within legal frameworks. Opposed to this, the extended stakeholder mindset corresponds with stakeholder theory [25].
Yet another approach to the construct takes a notion of RL at the very centre of leadership [35], supported by the claim that for a leader not to be responsible is to not be efficient [9]. Efficiency and performance lie at the focal point of leadership research, and according to the theory of responsible leadership for performance (RLP) [17], RL is framed as a performance system of interacting inputs, processes, outputs, feedback and boundaries where each variable has an impact on the others.
The delineation between RL and other leadership constructs is discussed in conceptual studies, with deviating conclusions. Voegtlin [35] suggested RL is related to but different from ethical leadership due to a missing responsibility dimension in ethical leadership. Other theorists have linked RL to leadership styles inclusive of an ethical component such as transformational, servant, ethical or authentic as well as spiritual and emotional leadership. The link between RL and aforementioned leadership styles is described either as belonging to the same family of leadership theories [36] or with RL as an overarching term of said theories [37].
Thus, the field of RL contains a diverse body of literature where conceptualizations vary in many aspects, such as their approaches to leadership (e.g., relational, contextual, leader attributes), the nature of responsibility (e.g., mindsets, behaviours), what the leader is responsible for (e.g., shareholders, customers, the natural environment) and delineation to other leadership constructs (e.g., overarching, related to, or different from transformational and ethical leadership).
Consistent with these divergent streams of RL conceptualizations, operationalizations of RL also diverge, although all claim to measure RL. Whereas the diversity in measurements represents an opportunity for more research [38], a multitude of survey instruments with the same label but different meanings attached to them is problematic for research purposes, as it can cause confusion. In order to build accumulated knowledge, new constructs should be conceptually different [39] and empirically dissimilar from other existing constructs [40]. If not, the consequence could be a confusing body of literature susceptible to several research fallacies, for example, so-called jingle and jangle fallacies [41,42], which entail the assumption that two constructs are the same because they have the same label (jingle fallacy), or that two very similar constructs are distinct because they have different labels (jangle fallacy).
With the intention to build accumulated research, we will review existing survey instruments on RL. How the operationalizations are linked to conceptualizations, what the RL survey instruments measure and to what extent they can be used interchangeably are some of the questions we attempt to answer. While the development of a survey instrument may be influenced by various contextual factors such as research purpose, theoretical backgrounds, source of data and study context, an aggregate of existing survey instruments may go beyond contextual factors and enable the extraction of common elements representing the core of RL operationalizations. Furthermore, we will gather current evidence from applied RL survey instruments mapping out antecedents, outcomes and boundary mechanisms of RL, synthesizing what we know about the RL nomological network.
The next section includes an outline of the five steps we followed for selecting articles and conducting our review (Figure 1).

Materials and Methods
A systematic literature review approach following the PRISMA 2020 statement [43] was adopted to synthesize empirical literature measuring responsible leadership. We qualitatively explored the content of survey instruments on RL, evaluated their psychometric validity, and mapped out the empirical findings to synthesize what we know about the construct. Based on our findings, avenues for future research are proposed. Thus, we aim to answer two main research questions (RQ). RQ1: What is the content of RL survey instruments? RQ2: What are the antecedents and outcomes of RL?

Searches
We started the systematic search by exploring search terms with different synonyms of responsible leadership, like responsible management, accountable leadership and other variants. The search on responsible management gave many hits. Inasmuch as leadership and management are somewhat overlapping yet distinct constructs [44], we found that the literature on responsible management diverged from the literature on responsible leadership, representing separate constructs. The managerial publications address issues on management education (e.g., [45][46][47]), governmental management (e.g., [48][49][50]) and the practice of managing specific contexts, such as models for environmentally responsible supply chains [51], while studies on RL covered topics about personal and relational aspects of leadership such as leader traits, attitudes and behaviour. Therefore, we narrowed our systematic search by using the term "responsible leader*" in the following six databases: Web of Science, Scopus, Business Source Complete, Academic Search Premier, SocINDEX and PsycINFO. We searched within title, abstract and keywords. The databases were selected because of their relevance to the topic and because they offer identical search criteria. Hence, databases without the possibility of narrowing the search to abstract or keywords (e.g., JSTOR and Springer) were excluded from the selection of databases. Our final update of the database search was undertaken on 8 January 2021.
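As an illustration of how a truncated term such as "responsible leader*" behaves, the following minimal sketch mimics the search over title, abstract and keywords with a regular expression. It is not the query syntax of any of the databases named above, each of which applies its own truncation rules; the record structure, the field names and the example records are hypothetical.

```python
import re

# Hypothetical stand-in for the truncated search term "responsible leader*".
# The trailing \w* plays the role of the truncation wildcard, so the pattern
# covers "leader", "leaders", "leadership" and similar word endings.
SEARCH_PATTERN = re.compile(r"\bresponsible\s+leader\w*", re.IGNORECASE)


def matches_search(record):
    """Return True if the title, abstract or keywords contain the phrase."""
    fields = (
        record.get("title", ""),
        record.get("abstract", ""),
        # Join keywords with a separator so that two adjacent keywords
        # cannot accidentally form the phrase across a keyword boundary.
        "; ".join(record.get("keywords", [])),
    )
    return any(SEARCH_PATTERN.search(text) for text in fields)


# Made-up records, used only to show which ones the pattern retains.
records = [
    {"title": "Socially responsible leadership in firms"},
    {"title": "Trust at work", "keywords": ["Responsible leaders", "CSR"]},
    {"title": "Responsible management education"},
]
hits = [r["title"] for r in records if matches_search(r)]
print(hits)
```

Because the pattern matches the phrase anywhere in a field, records on "socially responsible leadership" or "globally responsible leadership" are retained, while the responsible management literature is not.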

Inclusion and Exclusion Criteria
Papers matching all the following criteria were included: (1) published in the English language, (2) empirically measuring responsible leadership in private sector businesses, (3) peer-reviewed journal articles, (4) sampling leaders in organizations or other stakeholders including employees and (5) published up to December 2020. No limitations were placed on words connected to responsible leadership, such as socially responsible leadership or globally responsible leadership.
Papers were excluded for the following reasons: (1) student populations without any specifications of work experience made up the sample; (2) papers not published in peer-reviewed journals such as conference papers, thesis dissertations, reviews, conceptual and qualitative studies; (3) studies where the survey instruments were neither published nor obtainable after contacting authors or owners of license; and (4) studies on responsible leadership in non-profit contexts, such as leadership in public sector organizations and political leadership.
Two researchers independently reviewed titles, keywords, and abstracts of the first 200 records and discussed inconsistencies until consensus was obtained. Then, working in pairs, the researchers screened all retrieved studies. Next, one researcher screened full-text articles for inclusion. In case of doubt, a second researcher was consulted to make a final decision. Studies on student populations without reported work experience were excluded because they target learning outcomes from leadership courses rather than measuring RL practice (e.g., studies applying the Socially Responsible Leadership Scale, which is designed for educational purposes [52]).
Constructing a clear-cut line between private sector businesses and other types of organizations may oversimplify the plethora of different organizations where RL could have relevance. However, by limiting the scope to private sector businesses, our intent is to pinpoint the complexity of businesses' responsibility towards financial growth and to society. Therefore, studies on RL in institutions such as public hospitals where serving society is the main objective were excluded (e.g., [53,54]).

Data Abstraction and Synthesis
The initial search gave 1313 hits (Figure 2). After duplicates were removed, 784 articles remained. After screening against the inclusion and exclusion criteria, 30 publications were left. Two of these were excluded after a full-text assessment because they did not meet all inclusion criteria. Twenty-eight publications made up the final sample of studies (see Appendix A for a summary of all included studies). In line with the methodology following Latif and Sajjad [55], the articles were reviewed and analysed for instrument content (see Sections 3.1 and 3.2), psychometric strength (see Section 3.3), descriptive information and relevant outcomes (see Sections 3.4 and 3.5).
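The selection counts reported above can be checked with simple arithmetic. The sketch below is only a consistency check on the four stage totals given in the text; the number of records removed at each step is derived by subtraction and is not itself reported in the text, and the function name is a hypothetical helper.

```python
# Stage totals as reported in the text for the selection process.
stages = [
    ("records identified", 1313),
    ("after duplicate removal", 784),
    ("after screening against inclusion and exclusion criteria", 30),
    ("included after full-text assessment", 28),
]


def removed_per_step(stages):
    """Return (stage label, number of records removed to reach that stage)."""
    removed = []
    for (_, previous_n), (label, n) in zip(stages, stages[1:]):
        if n > previous_n:
            raise ValueError(f"stage '{label}' has more records than the one before")
        removed.append((label, previous_n - n))
    return removed


for label, count in removed_per_step(stages):
    print(f"{label}: {count} records removed")
```

Under these figures, 529 records were removed as duplicates, 754 during screening and 2 at full-text assessment, leaving the final sample of 28.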
Instrument content was described in detail for all scales included in the study (see Results, Table 1). An explorative content analysis of single items was undertaken following a conventional approach as suggested by Hsieh and Shannon [56]. We extracted all items from the included survey instruments, generating a table containing the pool of RL items. By exploring the content of all items, themes were generated and applied to the table of analysis. Based on the detection of clustered themes across survey instruments, we propose core aspects of the construct representative of the whole body of RL measurements. The full pool of items from our analysis is not included in this paper because some of the survey instruments are licensed. However, example items from all survey instruments and core aspects are included in the Results chapter, Table 2. The analysis also allowed us to recognize whether any of the RL survey instruments stand out by representing either all aspects or none of the aspects.
To evaluate the psychometric strength of each survey instrument, a comprehensive review of the survey's performance was conducted with respect to internal consistency, content validity, convergent validity and discriminant validity (see Results, Table 1). Furthermore, we critically appraised the survey instruments at item level.
The descriptive information included study contexts, year of publication (see Results, Figure 3), sample characteristics, utilized theoretical framework, research method, the strength of correlations between variables and hypothesis testing outcomes (see Appendix A). An integrative overview of the empirical evidence was synthesized based on Voegtlin et al.'s [32] multilevel framework of RL outcomes (see Results, Table 3, and Figure 4). According to the framework, RL outcomes can be measured at three different levels. The micro-level includes personal interactions inside the organization, such as leader-follower relations as well as employee outcomes. The meso-level contains organizational culture and business performance. The macro-level focuses on relations to external stakeholders [32]. We suggest that this framework could be mirrored and could include antecedents of RL. Furthermore, we added a new level to the framework with the label 'intrapersonal-level' giving room for variables focusing on the individual leader and their values, orientations and attitudes, which could also be explored as both antecedents and outcomes of RL. The mapping of correlation patterns informs conceptual clarity of the RL construct through its observable manifestations.
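Internal consistency, one of the psychometric criteria named above, is conventionally reported as Cronbach's alpha. The sketch below shows the standard formula, alpha = k/(k - 1) * (1 - sum of item variances / variance of total scores), for k items. It is an illustration only: the text does not specify how the reviewed studies computed their reliability coefficients, and the response data are invented.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Sample variances (n - 1 in the denominator) are used throughout.
    """
    k = len(responses[0])
    if k < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer the same number of items")
    item_columns = list(zip(*responses))  # one tuple of scores per item
    sum_item_variances = sum(variance(column) for column in item_columns)
    total_score_variance = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum_item_variances / total_score_variance)


# Invented responses from five respondents to a four-item scale (1-5 Likert).
example = [
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 5, 5, 4],
    [3, 3, 2, 3],
    [1, 2, 1, 2],
]
print(round(cronbach_alpha(example), 3))
```

When all items rank respondents identically the coefficient approaches 1; values lower than that indicate that the items share less common variance.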

Results
The results chapter is structured as follows. First, we present the results from our mapping (Section 3.1) and analysis of instrument content (Section 3.2). Second, we present the results from our evaluation of psychometric strength (Section 3.3). Thereafter, descriptive information and relevant outcomes from all included studies are laid out (Sections 3.4 and 3.5).

Survey Instrument Content
The sample (n = 28) comprises nine unique survey instruments measuring RL (Table 1). While most scales share the same name of RL, they build on somewhat different conceptualizations and are operationalized accordingly. Furthermore, the survey instruments were developed in various contexts that might influence their design. In the following, we describe the content of all instruments, their theoretical backgrounds, and the contexts from which they originate. The application of the different measures is presented in the section on descriptive information of selected studies. The measures are presented alphabetically by author below.
The multidimensional measure of RL by Agarwal and Bhal [57] focuses on RL as ethical and strategic leadership. The operationalization was carried out across industries in India and builds on a review of literature on RL behaviours. Based on their review, Agarwal and Bhal [57] defined RL as 'a phenomenon in which a leader aims at achieving sustainable organizational growth through development of positive stakeholder interactions and promotion of ethical behaviours' [57]. The survey instrument includes items from the Ethical Leadership Questionnaire [58] in addition to items developed by Agarwal and Bhal [57]. Across their dimensions of ethics and strategy, there are four subscales: moral person, moral manager, multistakeholder consideration and sustainable growth focus. A total of 18 items make up the survey instrument.
The RL survey instrument by Doh et al. [59] describes RL as an art and ability of building and sustaining trustful relationships with stakeholders inside and outside the organization and coordinating responsible action to a meaningful shared business vision, in line with Maak [29]. The survey instrument was developed across industries in India and includes 13 items across the three subscales: stakeholder culture, HR practices and managerial support. The authors did not report the specific theoretical background of their RL operationalization, but they did report on all items being either standard items used in previous research or created items intended to become part of composite scales [59].
The survey instrument on RL orientations by Javed, Akhtar, et al. [60] builds on the conceptualization of RL within different mindsets, which determine towards whom one perceives responsibility: a limited economic view and an extended stakeholder view as suggested by previous researchers [9,33]. A total of 18 items make up the four subscales that represent different RL orientations: traditional economist, opportunity seeker, integrator and idealist. At one end of the spectrum, leaders focus on financial value creation for shareholders, and at the other end, leaders focus on value creation for a wider group of stakeholders. The survey instrument was developed across industries in Pakistan.
The RL survey instrument by Lips-Wiersma et al. [61] is an operationalization of RL as an overarching term for the inclusion of ethical and moral aspects in leadership, in line with Antunes and Franco's [37] conceptualization. RL is measured as a composite of four different leadership styles: authentic, transformational, ethical and shared leadership. The survey instrument is unidimensional and contains four items that are descriptions of the abovementioned leadership styles. The survey instrument was developed across industries in the United States.

Table 1. RL survey instruments (the table was flattened in extraction; only the recoverable entries are listed, and entries whose row cannot be identified are given as unattributed).
Doh et al. [59]. Definition: builds on Maak [29]: 'The art and ability involved in building, cultivating and sustaining trustful relationships to different stakeholders, both inside and outside the organization, and in co-ordinating responsible action to achieve a meaningful, commonly shared business vision.' Subscales: (1) Stakeholder culture.
Javed, Akhtar, et al. [60]. Definition: ' . . . responsibility is a subjective phenomenon and considerably depends on the leader. To whom a business leader is responsible, and for what he is responsible is a person-specific occurrence.' [60]. Subscales: (1) Traditional economist. Development: item generation based on a conceptual study [33]; English language applied in Pakistan.
Voegtlin et al. [74]. Subscales: (1) Expert, (2) Facilitator, Citizen; 28 items. Items in the expert and facilitator dimensions adopted from LBDQ XII [75]; items in the citizen dimension adopted from servant leadership scales [76,77]. Example items: 'Tries out new ideas with the group'; 'Treats all group members as his equals'. Development: adaptation of previously validated and implemented survey items; back-translation from English to German in Switzerland.
Voegtlin, 2011 [78]. Definition: 'Responsible leadership refers to the awareness and consideration of the consequences of one's actions for all stakeholders, as well as the exertion of influence by enabling the involvement of the affected stakeholders and by engaging in an active stakeholder dialogue. Therein responsible leaders strive to weigh and balance the interests of the forwarded claims.' [78] (p. 59). Subscales: (1) The frequency of interaction with different stakeholder groups, (2) Discursive RL.
Unattributed definitions listed with '(1) Stakeholder culture': ' . . . the process of building and sustaining positive relationships with both internal and external stakeholders to the organization.' [18]; ' . . . the art and ability involved in building, cultivating and sustaining trustful relationships to different stakeholders, both inside and outside the organization, and in co-ordinating responsible action to achieve a meaningful, commonly shared business vision.' [29] (p. 334); 'A social and ethical phenomenon that occurs in the process of social interactions.' [65].
Unattributed definitions listed with '(2) Discursive RL, only subscale (2)': the definition by Maak [29] quoted above; the definition by Voegtlin [78] quoted above; ' . . . a relational and ethical phenomenon, which occurs in social processes of interaction with those who affect or are affected by leadership and have a stake in the purpose and vision of the leadership relationship.' [18] (p. 103); ' . . . a relational and ethical phenomenon that occurs in the social interaction process.' [18]; '...leadership that emphasizes the firm's sustainable development and embraces social responsibility.' [37]; ' . . . a leadership style where a leader acts as a weaver of stakeholder relationships and responds to both existing gaps in theory and practical leadership challenges.' [18].
Unattributed development notes: adaptation of previously validated and implemented survey items, as well as items from online surveys; expert reviews and pre-testing; English language in Germany. Application of previously validated survey items; back-translation procedure reported. Response format: 1 = agree, 2 = disagree.
Unattributed psychometric entries: full scale not reported; AVE for subscales (1) 0.82, (remaining values not recoverable). Low-level leaders α = 0.85; high-level leaders α = 0.92; CFA confirmed one factor. Several cells read 'Not reported'.

The RL survey instrument by Liu and Lin [67] describes RL as a type of value-centred leadership emphasizing the leader's ability to enable positivity through interpersonal interaction [67]. Building on a chapter on skills and strategies of RL [69] in the Handbook of Responsible Leadership [92], the items of the survey instrument reflect different traits of positivity central to responsible leadership, as outlined by Cameron and Caza [69], such as positive communication and positive connection. The survey instrument is unidimensional, and it contains six items. It was developed in the context of the high-tech industry in Taiwan.
The Competency Assessment for RL (CARL) by Muff et al. [71] defines the responsible leader as someone who ' . . . demonstrates a deep understanding of the system and the own person, is distinguished by an ethical and values-based attitude, and able to build long-term relations with different stakeholders embracing their needs, while initiating change towards sustainable development' [71]. The definition is based on their review of the theoretical development of RL. The survey instrument is also an online assessment tool that links RL to the sustainable developmental goals.
The Value Based RL scale (VBRL) by Saini [72] is another survey instrument on RL developed in the context of Indian businesses. The VBRL is set to measure top management's righteousness, compassion and concern and is based on a review of different leadership literatures, such as virtuous leadership. Like Agarwal and Bhal's [57] multidimensional measure of RL, the VBRL scale also includes items from the Ethical Leadership Questionnaire [58]. Moreover, items from the Ethical Climate Questionnaire [73] are included as well as items developed by Saini [72]. The survey instrument is made up of 20 items within four subscales that characterize the responsible leader: empathetic, value-oriented, responsible and nurturing.
The RL survey instrument by Voegtlin et al. [74] defined RL as 'Leader's behaviour oriented toward the fulfilment of organizational tasks, the needs of employees and the needs for society simultaneously and over time. Leaders assume responsibility . . . in their roles as expert, facilitator and citizen'. Their understanding of RL builds on theories of behavioural complexity and stakeholder theory. The theory of behavioural complexity [93] argues for the benefit of both/and-behaviour above either/or-strategies and assumes that leaders can both conceive and perform multiple and contradictory roles. Their defined three roles of RL make up the subscales in the measure: an expert with organizational expertise, a facilitator that motivates and cares for employees, and a citizen who considers consequences for society. Items in the survey instrument were adopted from the Leader Behavior Description Questionnaire (LBDQ) [75] for the roles of expert and facilitator and two scales of servant leadership [76,77] for the role of citizen. The scale was developed in a Swiss business context.
The Discursive RL survey instrument by Voegtlin [78] measures one facet of RL only, defined as 'the awareness and consideration of the consequences of one's actions for all stakeholders, as well as the exertion of influence by enabling the involvement of the affected stakeholders, and by engaging in an active stakeholder dialogue' [78]. Theoretically, the survey instrument is based on the steps of discursive conflict resolution [94], where the ideal discourse requires all affected persons to have equal chances to participate in the discourse in a condition of symmetrical power relations. The survey instrument includes 16 items across two subscales, where the first subscale asks about the leader's relationship to different stakeholder groups, while the other is a scale on discursive RL. The scale was developed in a combination of Swiss and German business contexts.

Units of Analysis in RL Survey Instruments
Several of the survey instruments (n = 5) were designed as subordinate reports about their direct supervisors. One of these is focused on leaders in top management, with middle managers as the respondents [72]. Some of the survey instruments were designed as subordinate reports with a wider unit of analysis (n = 2). One of these focuses on RL of all leaders in the organization rather than direct supervisors [61], while another measures RL of the direct leader and the organization in combination [59]. There are also survey instruments on RL designed as self-reports for leaders (n = 2) [60,67].

Response Scales in RL Survey Instruments
The most frequently applied response scale in the survey instruments was a 5-point Likert response scale. The exceptions were two instruments with a 7-point Likert response scale [59,60], one instrument with a 4-point scale [72] and another with a dichotomous response option [71]. None of the survey instruments included a non-response option.

Analysis of Survey Instruments: RL Core Aspects
While the survey instruments contained various RL aspects as described above, there are many similarities between operationalizations. Based on our content analysis of the pool of RL items (n = 177), we found four clusters of themes, which we will refer to as the RL core aspects. The four core aspects are labelled with descriptions of what responsible leadership is and what a responsible leader does (see example items, Table 2).
Accountable role model: This aspect includes items describing the leader's accountability towards others, self-awareness and their acting as a role model for stakeholders, mostly for employees. Example items are 'Shows integrity and honesty in his actions', 'Responds fairly to complaints and concerns' [72]; 'Takes ownership for own actions' [57]; 'He makes his attitudes clear to the group' [74]; and 'My immediate manager leads by example' [59]. Items adapted from ethical leadership scales fit within this aspect. All survey instruments except the Discursive RL scale [78] had items represented in this core aspect.
Inclusive facilitator: This aspect, found across the survey instruments, covers the appreciation of diversity, equal power relations, generous and broad involvement and the ability to take on different perspectives. Sample items are 'Conflict can be the basis for creativity'; 'When looking for solutions I integrate insights from diverse disciplines' [71]; 'Listens to what subordinates have to say' [57]; 'He tries out new ideas with the group'; 'He treats all group members as his equals' [74]; and 'Involves the affected stakeholders in the decision making process' [78].
The scale on discursive RL by Voegtlin [78] almost exclusively falls within this core aspect, while the survey instrument by Liu and Lin [67] does not have any items fitting within this category.
Pro-active planner: This aspect includes items describing mindsets of long-term planning as well as a welcoming attitude of change and innovation. Some items are limited to the long-term planning of the organization, while other items include external stakeholders and the macro-level. Example items are 'Shows concern for availability or conservation of resources (e.g., natural resources) when planning for future business demands' [57]; 'I create long-term value for a number of stakeholders' [60]; 'Transformational leadership: Leader behaviours that transform and inspire employees to perform beyond expectations, push employees to develop innovative strategies while going beyond self-interest for the good of the organization' [61]; 'When making decisions, one should also consider future generations' [71]; and 'My supervisor is preparing the organization to make a positive difference in the future' [74]. Although many items fell within this core aspect, three survey instruments were not represented [59,72,78].
Benevolent value creator: This aspect includes items representing a wide perception of value creation (e.g., creation of knowledge and social value) and a concern for others' welfare. Sample items are 'Considers stakeholder well-being as important business outcome' [57]; 'Our organization believes all employees deserve to be actively managed as talent' [59]; 'I try to serve my stakeholders' [60]; 'I am concerned about employee emotion' [67]; 'The welfare of people and nature is important to me'; 'It is important to me to find solutions to problems that are relevant to society' [71]; and 'Concerned about the customers and public interest' [72]. Only one survey instrument did not include items from this core aspect [61].
Out of the 177 items, a few (n = 18) did not fit in any of the four thematic clusters. These describe more general leadership behaviours unspecific to RL. Example items are 'My immediate manager is effective' [59]; 'Even in an emergency I wouldn't feel like panicking' [71]; and 'I have minimal use for cost benefit analysis' [60]. On the other hand, some items matched several themes and could thus belong to more than one core aspect. Example items are 'Encourages my personal and professional development' [72] which could belong to both 'benevolent value creator', and 'pro-active planner', and 'I try to empower stakeholders' [60] which matches both the 'inclusive facilitator' and 'benevolent value creator' aspects. In these cases, two researchers discussed and reached consensus on which aspects the items should be ascribed to.

Psychometric Validity of RL Survey Instruments
The survey instruments explored in the present study were reviewed for four psychometric properties: (1) internal consistency, (2) content validity, (3) convergent validity, and (4) discriminant validity. The psychometric properties of the included survey instruments are shown in Table 1. Of the 28 reviewed articles, 16 articles reported all four psychometric properties.
Overall, the RL survey instruments showed acceptable to good reliability (α = 0.79-0.96). Only one study did not report internal consistency for the survey instrument, while three studies reported reliability for the subscales only and failed to report internal consistency for the full scale.
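For readers less familiar with the reliability coefficient reported above, the following is a minimal sketch of how Cronbach's α is conventionally computed from item-level responses. The function name and the response data are hypothetical and serve illustration only; they are not taken from any of the reviewed studies.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Uses the standard formula alpha = k/(k-1) * (1 - sum(item variances) / variance(total score)).
    """
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2 or any(len(row) != k for row in rows):
        raise ValueError("need at least two items, same number per respondent")
    # Sample variance of each item across respondents
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the summed scale score
    total_var = variance([sum(row) for row in rows])
    if total_var == 0:
        raise ValueError("total score has zero variance; alpha is undefined")
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses: three respondents, two items
responses = [[1, 2], [2, 4], [3, 6]]
print(round(cronbach_alpha(responses), 3))  # prints 0.889
```

Because α depends on the number of items as well as on their intercorrelations, values from instruments of different lengths (e.g., a six-item versus a 20-item scale) are not directly comparable.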
All studies reported on how the content validity was established (e.g., open-ended feedback from academic experts, adaptation of previously validated survey items). However, just 4 out of the 9 scale development studies reported multiple procedures to ensure content validity. Moreover, matters of language were not addressed in 10 of the included studies (e.g., back-translation procedures, the application of English survey instruments in Asian business contexts). Our critical appraisal of survey items showed that some RL measurements may be prone to certain biases.
In the self-report survey instruments on RL, many of the items carry a positive or negative connotation, which makes them prone to social desirability bias. Items such as 'I would not accept a bribe, even if it were very large' are not neutral, and it is questionable whether anyone would respond negatively to them.
The lack of a non-response option in questionnaires where respondents are asked to report on their perception of someone else's behaviour, attitude or value may cause bias because of an underlying assumption that all employees possess this information about their supervisor. Items like 'Please indicate how often your supervisor interacts with local community representatives' [78] illustrate this problem.
Considering face validity, there is a risk of nuances getting lost in ambiguous questions like 'My direct supervisor demonstrates awareness of the relevant stakeholder claims'. Interpreting the item involves a chain of assumptions (e.g., how is awareness demonstrated, what is a relevant stakeholder claim, which stakeholder is in question), which could induce measurement error.
Within the sampled articles, a total of 12 studies failed to report convergent validity. Reported convergent validity for RL survey instruments varied across studies, where 11 studies reported on average variance extracted (AVE = 0. [...]), and seven studies reported factor analytic evidence of convergent validity (CFA > 0.5).
Discriminant validity was reported by confirmatory factor analysis in 24 studies. Eleven studies reported on additional evidence of discriminant validity (e.g., HTMT index, AVE greater than paired correlations). Three studies did not report discriminant validity. Only two studies reported on discriminant validity between RL and associated leadership constructs such as servant, ethical and transformational leadership.
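As a reading aid for the convergent and discriminant validity criteria referred to above, the following sketches how AVE and the comparison of AVE with inter-construct correlations (the Fornell-Larcker criterion) are conventionally defined for standardized loadings. The notation is introduced here for illustration and is not taken from the reviewed studies, which may have applied other variants: λ_i is the standardized loading of item i, n the number of items of construct j, and r_jk the correlation between constructs j and k.

```latex
\[
\mathrm{AVE}_j = \frac{1}{n}\sum_{i=1}^{n} \lambda_i^{2},
\qquad
\sqrt{\mathrm{AVE}_j} > \lvert r_{jk} \rvert \quad \text{for all } k \neq j .
\]
```

An AVE of at least 0.5 is a commonly cited rule of thumb for convergent validity, while the inequality on the right expresses that a construct should share more variance with its own items than with any other construct.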
Thus, the earliest quantitative studies on RL were published in 2011 [59,78]. However, only recently has there been a substantial increase in studies both designing new survey instruments and applying existing ones. Out of the 28 studies included, 24 were published during the last two years. This could indicate the start of a trend of increased interest in measuring RL, suggesting that more such studies will follow in the coming years.
The geographical scope of the included studies is presented in Figure 3. Out of the 28 papers, 20 are from Asian countries. There are five studies from Europe, three from Oceania and one from North America.

Sample Characteristics in Selected Studies
A total of nine studies sampled respondents across industries (Figure 3), while the remaining studies limited their scope to specific industries such as hotels, manufacturing firms, the banking sector, insurance companies and sales industries. Responses are mainly subordinates' reports of their direct leader (18) or leaders in general in their organization (3). However, a few of the selected studies included self-reports from respondents at top management and/or middle management (4), as well as middle-management respondents reporting on their CEO (3). The sample sizes ranged from n = 89 to n = 4352, while most sample sizes included 100 to 400 respondents. Only four studies included more than 400 respondents. (See Table 2 for an extensive overview.)

Theoretical Frameworks in Selected Studies
Various theoretical frameworks were applied in the selected papers. However, seven papers in the sample did not report their theoretical framework (Appendix A).
Several studies applied social psychological frameworks such as social exchange theory [70], social identity theory (SIT) [91,99,100], social identity theory of leadership (SITL) [64] and social cognitive theory [81], which all put an emphasis on social mechanisms between individuals or between members in groups. Role theory was applied in a few of the studies [67,79], much in line with the conceptualization of the roles model of responsible leadership [18], where the responsible leader has different roles to play according to different contexts.
In addition to psychological frameworks, some of the studies took on business theoretical perspectives. Considering the close theoretical tie between RL and stakeholder theory, it is no surprise that several studies applied this theoretical framework [74,83,85], which includes a strategic perspective lacking from social psychological approaches. Upper echelons theory was also applied in the empirical studies [60] and centred on leaders' personal experiences, attitudes and values and how they influence decision-making within the boundaries of organizational decision latitude. Some studies combined the abovementioned psychological and business frameworks, including social identity theory and stakeholder theory (see Appendix A for an extensive overview).

Research Methods and Designs in Selected Studies
Out of the 28 studies, five applied mixed methods, 22 were cross-sectional surveys, and one was longitudinal (Appendix A). There was one experiment among the mixed-methods studies. Although many of the cross-sectional studies operated with independent and dependent variables by applying SEM analysis, in this paper we will only refer to causal relationships when they are reported in longitudinal or experimental studies. In all other cases, we will refer to correlations.

Applied RL Survey Instruments in Selected Studies
The discursive RL survey instrument by Voegtlin [78] was applied in 14 of the included studies and was by far the most-used RL survey instrument (see Table 1). Although the measure includes two subscales and 16 items in its original design, nearly all publications applied just one of the subscales, which includes five items. Many studies did not report this exclusion of items from the measure. Furthermore, in most studies, the survey instrument was referred to as a measure of RL rather than a measure of discursive RL, which Voegtlin [78] described as an aspect of RL rather than the full construct. The scale originates from Swiss and German contexts and was applied in nearly all the included studies from Pakistan and China as well as a study from Spain.
The RL survey instrument by Doh et al. [59] was applied in five of the included publications. Originating from a study in India, the survey instrument was later applied in the Australian and Turkish contexts. Some studies used the survey instrument in its original form, while others made linguistic adjustments to improve the face validity of items.
The RL survey instrument by Liu and Lin [67] was applied in two of the included studies; it both originated and was applied in Taiwanese contexts. While the original form of the survey instrument was designed as a self-report measure for leaders, the wording was adjusted in the applied study to capture subordinates' perceptions of their direct supervisor.

Control Variables
All studies sampled respondents from members of the organization, supervisors or subordinates. Thus, many of the same control variables were found across the studies, such as organizational tenure, the years employees had worked together with their supervisors, or work experience in general. Level of education, level of income, gender, age and position in the organization were also included. Some studies also included controls such as religion and marital status. Size of the organization and number of employees under the same supervisor were also controlled for in some studies.

Integrative Overview of Empirical Evidence: Correlation Patterns
Correlation patterns are categorized according to four levels of antecedents and outcomes we consider relevant for RL, as described in the methodology section: intrapersonal, micro, meso and macro-levels (Table 3 and Figure 4). A complete overview of correlation sizes and effects from statistical hypothesis testing is included in the summary of studies, Appendix A.
Note: (+) = positive relationship, (-) = negative relationship, (/) = no relationship.

Intrapersonal Level
The intrapersonal level includes variables ascribed to the individual leader, such as personal traits, personal values and attitudes. There is limited evidence at the intrapersonal level, as only one of the included publications explored these relationships [74]. Empathy, positive affect and universal values were found to facilitate RL, while no relationship was detected between holistic thinking and RL. There was no exploration of outcomes or mediating and moderating variables at the intrapersonal level (Table 3 and Figure 4).

Micro-Level
The micro-level includes antecedents and outcomes inside organizations between members and within employees. Due to a lack of research on antecedents at the micro-level, all of the evidence below concerns the predictive value of RL through direct or indirect effects.
At the micro-level, we observed three clusters of research, which partly overlap. The first cluster centred on RL and employee turnover, the second on RL and employee organizational commitment. The third cluster concentrated on RL and responsible employee behaviours, with various mediation and moderation variables. In addition to these clusters, a few studies addressed RL and leader outcomes, and RL's effect on various classical organizational measures, such as job performance, well-being and creativity.
Across several studies, RL showed negative associations with turnover intentions [59,62,63,67,89] as well as turnover rates [59]. The number of unique studies on RL and turnover intentions, together with turnover rates from a large-scale study (n = 4352) [59], provides considerable evidence of the relationship. It should be noted that several of the studies on RL and turnover intentions applied the RL survey instrument by Doh et al. [59], which emphasises organizational responsibility for employees, especially talent retention. Thus, a high score on RL implies organizational long-term investment in employees, which in turn can reasonably be expected to relate to turnover. Several mediators between RL and turnover intentions were found in single studies, but none of these were tested in multiple studies, leaving the underlying mechanisms somewhat unestablished (Table 3).
RL and employee organizational commitment showed a positive direct relationship in four studies. Both employee organizational commitment in general [62] and its components, such as affective commitment [74] and normative commitment [64], showed positive relations to RL. This also holds for RL at the executive level and its influence on affective commitment among mid-level managers [72]. However, organizational commitment was also suggested as a mediator between RL and micro-level variables in two studies [63,99], leaving the nature of these relationships somewhat unclear. Although a number of studies report a positive relationship between RL and organizational commitment, three of the six studies plausibly drew on the same sample and may therefore not count as three independent studies.
Several responsible employee behaviours directed towards (meso- and) macro-level factors such as the natural environment [82,83,90,91,99], and towards external stakeholders [57,74], have positive associations with RL. The same goes for responsibility principles such as moral courage and whistleblowing intentions [79], implying that responsible leaders influence responsible employee intentions and behaviour. Furthermore, RL and different kinds of employee unethical behaviour [78,81] showed negative relationships, supporting the findings of RL and its positive influence on responsible employee behaviour.
Although most studies on RL and variables at the micro-level focused on employee outcomes, a few investigated RL and leader outcomes. RL is positively associated with (employees' perceptions of) leaders' effectiveness [74] and employees' satisfaction with their leader [72]. High-level leader RL had a positive association with low-level leader RL, suggesting trickle-down effects of RL in the organization [81].
Quite a few studies explored the mediating effects between RL and micro-level outcomes. Several mediators comprised certain kinds of recognition between (employee-) self and leader or organization, such as value congruence between employees and leaders at different levels [81], leader identification [90], organizational identification [67,80], and person-organization fit [79]. This suggests that the performance of RL resonates more with employees who share values with their organization and supervisor, or in other manners identify with the responsibility of RL.
Moderation effects between RL and micro-level variables were tested in various single studies. Some of these are linked to relational mechanisms, such as supervisor-subordinate guanxi [83] and the frequency of interaction between supervisor and subordinates [78]. Other moderators pertain to employees' perceptions of their role and their ability to make an impact on matters of social and environmental responsibility, such as locus of control [99] and the perceived role of ethics and social responsibility (PRESOR) [90].

Meso-Level
The meso-level includes group-level organizational variables such as organizational culture, climate and business performance. At the meso-level, we find research on outcomes but not on antecedents of RL. Furthermore, three out of the six studies investigating meso-level outcomes did not use RL as a predictor but as a moderator.
RL is positively associated with corporate reputation [86], corporate sustainability performance, and financial [85,86], environmental, and social performance [85]. The positive relations between RL and all units of the triple bottom line suggest that RL does not induce trade-offs between financial and environmental performance.
The relationships between RL and meso-level variables are mediated by employee involvement in sustainability activities [87], innovation and (employees' perceived) corporate reputation [85].
RL has direct effects on ethical climate [66] as well as previously mentioned indirect effects on micro-level variables through the mediation of ethical climate [89]. The relationship between these variables needs further investigation.
RL positively moderates the positive relationship between stakeholder pressure and corporate environmental ethics [88]. Different orientations of RL have been found to have varying moderating effects: two orientations of RL with a wide stakeholder focus positively moderated the relationship between responsible governance and CSR, while the two orientations with a narrow stakeholder focus negatively moderated the same relationship [60]. RL negatively moderates the relationship between CSR and corporate reputation and between CSR and financial performance [86]. Authors have suggested that an overemphasis on stakeholder concerns and CSR could decrease financial performance. These findings are contrary to other findings in the sample where RL positively affects the triple bottom line.
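To make the direction of these moderation effects concrete, a moderated regression can be sketched as follows. This is an illustrative assumption rather than the model of any particular study: X denotes the predictor (e.g., CSR), M the moderator (here RL), Y the outcome (e.g., corporate reputation), and the β coefficients are notation introduced here. The reviewed studies may have used other estimation approaches.

```latex
\[
Y = \beta_0 + \beta_1 X + \beta_2 M + \beta_3 (X \cdot M) + \varepsilon,
\qquad
\frac{\partial Y}{\partial X} = \beta_1 + \beta_3 M .
\]
```

In this notation, positive moderation corresponds to β3 > 0, meaning the association between X and Y strengthens as RL increases, whereas negative moderation corresponds to β3 < 0, meaning the association weakens (or reverses) as RL increases.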

Macro-Level
Macro-level variables represent all factors external to the organization, such as customers, governments, actors in the supply chain and the natural environment. As we have seen throughout the content analysis of RL survey instruments, the macro-level representing all external stakeholders is imperative to RL. Still, only one of the included studies investigated RL and macro-level variables, leaving this level largely unexplored. The one study found that RL positively influenced stakeholders' perceptions of leader attractiveness, and organizational attractiveness [74].

Discussion
Through our systematic review of studies measuring RL, aiming to designate the core of RL survey instruments, we found diverging definitions and various dimensions in the selected survey instruments. As a result of an explorative content analysis of all RL items, four clusters across survey instruments were found: accountable role model, inclusive facilitator, pro-active planner and benevolent value creator. We propose these as the four core RL aspects that may contribute to the delineation of what RL is and what it is not. The evaluation of survey instruments' psychometric validity indicated that only two survey instruments included adequate evidence of incremental validity by reporting discriminant validity against ethical, servant or authentic leadership. The weak evidence raises questions about the unique contribution of RL.
With the aim of mapping antecedents and outcomes of RL, our review of the empirical evidence shows a lack of research on antecedents and great focus on outcomes. A compilation of the researched outcomes shows prospects of RL's ability to facilitate responsible employee behaviours directed toward the organization, the environment and society and that the influence of RL is stronger when employees feel committed to the organization, for instance by sharing values or identifying themselves with the organization or the leader (Figure 3). Despite the aggregation of correlation patterns in our review, replication studies are scarce, leaving connections between RL and its suggested outcomes unsettled. Furthermore, research designs establishing causality are more or less lacking, leaving the directions of relationships uncertain.
The geographical overview and publication timeline showed a rapid increase in quantitative studies on RL during the past three years, where Asian countries are most prolific. This paints a very different picture than previous reviews (e.g., [19,20,22]) and may indicate a shift in the field. A bibliometric review including all published studies (quantitative and qualitative) on RL between the years 2006 and 2016 [19] found only a few quantitative studies. In that period, the US topped the list by number of publications, followed by European countries like Switzerland, Austria, Spain and France.
The discussion is structured as follows. First, we address RQ1 by critically reflecting on the four RL core aspects found through our analysis. Furthermore, based on our evaluation of psychometric validity, we briefly comment on RL survey instruments, focusing on item wordings prone to biases. Then we move on to RQ2 and discuss antecedents and outcomes of RL with an emphasis on the meaning of responsibility for stakeholders.

Conceptual Clarity
The growth spurt of quantitative research on RL gives valuable insight into how to measure the construct. However, a parallel development of multiple survey instruments may cause difficulties in building accumulated knowledge and challenge the prospect of incremental validity. Thus far, none of the included studies has tested incremental validity between different RL survey instruments or evaluated the need for yet another measure of RL, which makes the literature prone to a proliferation of constructs.
Furthermore, the unique contribution of RL compared to related leadership constructs such as ethical and servant leadership remains somewhat unclear. Items from an ethical leadership scale were adopted in two RL survey instruments emphasizing ethics as part of the RL construct [57,72]. Agarwal and Bhal [57] tested discriminant validity with servant and authentic leadership but not with ethical leadership, which in their case represents a partially overlapping construct. Voegtlin et al. [74,78], on the other hand, emphasized the conceptual difference between RL and ethical leadership and reported discriminant validity between discursive RL and ethical and transformational leadership [78]. Moreover, Voegtlin et al. [74] adopted items from servant leadership questionnaires [76,77] for one dimension of their RL survey instrument, thus treating servant leadership as partially overlapping with RL, and accordingly did not test discriminant validity against it.
Consequently, even though RL was tested for discriminant validity with related leadership constructs, this is not consistent across survey instruments, as they are made up of various building blocks that partially overlap with other constructs. Adding to the disarray, the umbrella construct of RL [61] underlined similarities between RL and related leadership constructs by including authentic, transformational, ethical and shared leadership in the RL construct. The researchers in that study found support for a unidimensional scale with acceptable internal consistency, underlining the close relatedness of these constructs.
The width of the RL construct is another aspect of conceptual discord between studies. Empathy and HRM were both included as a dimension in RL survey instruments (see pp. 16-17) [59,72] and were investigated as separate constructs in relation to RL in other studies [74,84]. The implication of a construct such as empathy being a part of RL is important because it implies that RL is more than behaviour; it is also behavioural intentions. Further, an inclusion of HRM implies an operational definition in keeping with RL conceptualizations where systems are included in the construct. Currently, all these approaches are represented in RL survey instruments. Items like 'Tries to assess impact on stakeholders before making business decisions' imply a leader with good intentions, while 'I create long term value for a number of stakeholders' could have a number of different motivations (e.g., competitive advantage, care for stakeholders), whereas 'This organization responds well to a diverse group of stakeholders' includes the system. Thus, the conceptual clarity of RL comes across as rather fuzzy and needs further attention on several key aspects. Our proposed four RL core aspects are an attempt to centre on the uniqueness of RL and provide a stepping stone towards conceptual clarity.

Characteristics of RL
Across all four RL core aspects, we found differences in whom the responsibility is directed towards. Some items describe a leader's responsible principles rather than behaviour or intentions towards stakeholders. The aspect 'accountable role model' includes many items originally adopted from ethical leadership scales and is largely focused on leader accountability and acting as a role model for employees. Social learning lies close to this aspect; however, the extent to which social learning stretches seems limited to members of the organization rather than inclusive of external stakeholders. Accountability, on the other hand, is directed to a wide array of stakeholders.
The core aspect 'inclusive facilitator' incorporates the relational and Habermasian aspects of RL. Relational aspects of leadership and Habermasian discourse ethics assume equal power relations between actors [27,94] and do not fit well with hierarchical leadership systems. Discursive RL falls almost exclusively within this aspect of RL, and theoretical foundations for the aspect are found in several conceptualizing studies (e.g., [30,35,101]). In their conceptual study, Maak and Pless [18] described one of the roles of a responsible leader as a 'weaver' of a network of relations inside and outside the organization. This bridges RL with rising concepts in business sustainability theory and practice, such as co-creation and social innovation, where businesses, governments and citizens cooperate to overcome problems linked to mutual societal challenges, such as climate change, mental health and education. Furthermore, rather than top down or bottom up, one can assume lateral influences, like those described in social networks.
Besides relational aspects, this aspect also includes the leader's ability to take on perspectives from different stakeholders and include these in business practices like decision-making processes and corporate strategies. Despite the principles of equal power relations in the inclusive facilitator aspect, some RL items go against these principles. Items such as 'disciplines followers who violate organization's ethical standards' suggest that responsible leaders have the privilege of defining what is wrong and what is right rather than jointly sorting it out. In our opinion, this theoretically contradicts a core aspect of RL, and such items should be considered for exclusion. Still, the emphasis on equal power relations raises questions about how well the map fits the terrain and whether RL is compatible with traditional hierarchical organizations.
The core aspect 'inventive planner' centres on sustainability aspects of leadership, such as liability for future generations. The resourcefulness to look beyond primary stakeholders here and now and plan for long-term prosperity inside and outside organizations is a key aspect of RL. While some items focus on long-term prosperity for the organization (e.g., retaining talent, linking present business tasks with long-term organizational goals), others include a wider perspective of constituencies (e.g., create long-term value for stakeholders, show concern for conservation of resources). Included in the aspect is also an ability to question the status quo and adapt business practices to current demands. Even though sustainability aspects are of utmost centrality to present business responsibility (e.g., UN SDGs), this is the RL aspect with the fewest items included, suggesting less emphasis in current measures of RL.
'Benevolent value creator' represents both visible and invisible stakeholders (e.g., supply chain, animal welfare, and hidden modern slavery), who may be overshadowed by other proximate and prominent stakeholders. This aspect concords with CSV [14], where instead of focusing on certifications and CSR systems, many companies include their business responsibility in strategies of shared value creation, often in collaboration with external stakeholders, such as welfare programs in local communities and innovative procurement partnerships with public sector offices that benefit all parties. A wide understanding of what value creation is lies at the core of RL, and while trade-offs between profitability and social and environmental responsibility might seem necessary, this could be considered a perception rather than a fact. Our review shows no such trade-offs under responsible leaders, as RL is positively associated with all parts of the triple bottom line. Previous research has also indicated that corporate financial performance and socially responsible performance can concur [102,103], supporting the idea of shared value creation [14]. As Waldman and Siegel [12] observed, shareholders of many firms are increasingly demanding that their firms 'do well by doing good'. Freeman et al. [104] underlined this aspect clearly by stating "Shareholders are stakeholders. Dividing the world into 'shareholder concerns' and 'stakeholders concerns' is roughly the logical equivalent of contrasting 'apples' with 'fruit'" [104].
Stakeholder theory lies in the background of all four RL core aspects, and hence ensures responsibility beyond the boardroom. But what constitutes a stakeholder differs between survey instruments (e.g., narrow or wide inclusion, generic or specific terms, defined or undefined for respondents). While some survey instruments focus on the leaders' ability to make all employees, and hence the organization as a whole, act responsibly towards external stakeholders, others focus on the leaders' direct relation to and care for internal and external stakeholders, such as employees, customers and suppliers. Applying a narrow stakeholder focus, a few survey instruments exclusively addressed the stakeholder group of employees [61,67], which makes them stand out from the other survey instruments by complying with traditional leader-follower measurements. On the other side of the spectrum, Voegtlin's [78] discursive RL scale comprised an extensive list of stakeholders, granting equal focus to all of them (it should be noted that this list comprised one dimension of the discursive RL survey instrument that is excluded from most studies applying the measure; see 'Applied RL survey instruments', p. 24). Stakeholder specificity also varies across survey instruments. Simply using the word 'stakeholder' or 'stakeholder group' could induce ambiguity, leaving open for interpretation which stakeholder is in question or eliminating the possibility to differentiate between stakeholders. A few measures counter this ambiguity by naming specific stakeholders in items, such as natural resources [105] and customers [72].
One of the most prominent classifications in stakeholder theory is classifying stakeholders as primary or secondary stakeholders [106]. 'Primary stakeholders are those groups without whose continuing participation the corporation cannot survive' [106] (p. 106). These groups include shareholders, employees, customers and other stakeholders related to the economic profitability of the organization. There is usually a high interdependency between the organization and these stakeholders. Secondary stakeholders are those groups who 'influence or affect, or are influenced or affected by the corporation but are not engaged in transactions with the corporation and are not essential for its survival' [106] (p. 107). These include stakeholders such as NGOs and local communities, and they usually represent broader societal concerns. Although the natural environment was not considered a primary stakeholder in the original stakeholder theory, some recent theorists have argued that it should be [107,108], considering its centrality to all businesses. At the World Economic Forum, statements about nature as the most important stakeholder of the current decade have been proclaimed [109]. Despite its centrality to current business practice, others are of the opinion that stakeholder status should not be ascribed to non-human environments, and that the natural environment is still accounted for through legitimate organizational stakeholders [110]. Following this fairness-based approach, stakeholder theory's concordance with environmental responsibility is indirectly maintained not just through environmental NGOs, but through employees, customers and maybe even shareholders.

Responsibility for Stakeholders and the Natural Environment
We have established that most survey instruments on RL measure leadership of the individual direct leader, while a few have a wider focus such as leadership of the organization. This implies that the evidence on RL is largely based on employee reports of middle-level managers. One study found, however, that RL was performed at the middle level when performed at the top level, suggesting a trickle-down effect. But this study was cross-sectional, limiting the possibility to draw conclusions about causality. Considering the nature of RL, implicating equal power relations and lateral influence (rather than top-down) inside and outside of the organization, it is plausible that social influence also happens through trickle-up, trickle-around, trickle-in and trickle-out effects [111] where leaders are affected by stakeholders such as employees, customers and competitors. External stakeholders, representing a vital part of RL and its proposed radius of action, are almost absent in the evidence. Even corporate reputation is reported by employees and not by external respondents. Only one of the included studies used external stakeholders as respondents, reporting on their perception of leader attractiveness and organizational attractiveness of RL leaders compared to instrumental leaders. Therefore, RL's influence on external stakeholders, such as customer environmental consumption and supply chain fair wages, was not included in the empirical evidence.
With its close conceptual proximity to meso-level variables, such as CSR and the triple bottom line, it comes as no surprise that RL seems to positively influence these. Moreover, trade-offs between financial performance and social performance did not occur in the evidence. This is promising, because it indicates a kind of leadership with the potential to increase organizational responsibility working towards societal goals such as UN SDGs while also maintaining financial business capacity.
Mechanisms strengthening the relationship between RL and the proposed outcomes are leader identification, person-organization fit and identification with organization, which are all constructs akin to sharing values with the leader and with the organization.
Our overview (Figure 3) reveals that there is no research on the outcomes of RL at the intrapersonal level. Relevant factors here could be the leader's rewarding feelings of acting in accordance with one's own values and social capital. This research could also shed light on the potential negative intrapersonal effects of being a responsible leader, considering what we know from the field of whistleblowing. Assuming responsible decisions are not always favoured, unpopular decisions may cause stressful work situations.

Limitations and Future Research Agenda
The scope of our review is limited to research measuring RL only, and RL was not compared to data from related leadership constructs, such as ethical and transformational leadership. Therefore, our discussions on delineations between RL and said constructs are informed by RL studies only. Due to the explorative nature of the content analysis, the four core aspects were not tested for any statistical properties, such as factor analysis or interrater validity. The four proposed RL core aspects need further investigation to test for discriminant factors within the pool of items.
Despite the scope of our review being limited to private sector businesses, we recognize that RL could have relevance for other organizations. The four RL core aspects propose aspects of high importance to public sector leadership, such as stakeholder inclusion and value creation (e.g., public-private partnerships, citizenship, co-creation). The selected studies included a wide array of industries and seemed representative across private sector businesses, but the geographical scope indicates a skewed sample with an overrepresentation of Asian study contexts. We cannot draw conclusions on whether this current Asian dominance in the field is a trend or a research sample bias. Nevertheless, within our scope, studies from other continents are needed to shed light on generalizability across geographical areas and cultural differences relating to antecedents and outcomes of RL. Furthermore, what can be considered responsible leader behaviour may vary according to context and point in time. For instance, actions that are pro-environmental must be considered in relationship to other possible actions, and there is no absolute standard for determining what is pro-environmental [112]. This presents a challenge for designing RL questionnaires that are valid across cultures and time. While type of industry, cultural context and size of companies were included in many of the studies, organizational structure was not reported or controlled for. Regardless of cultural context, organizational structures such as traditional hierarchical organizations compared to other organizational structures could be explored for their influence on the effects of RL. In line with publication bias, there could be an underreporting of studies that have found no correlations.
Additionally, a large proportion of the included studies did not report on matters related to language or translation of survey instruments. This could either indicate the application of survey instruments in their original language across countries, or it could represent a lack of reports on translation procedures in the studies.
Based on our findings, we suggest three main areas of interest for future research: (1) establish evidence for incremental validity, (2) explore causal and boundary mechanisms and (3) expand the stakeholder focus.

Establish Evidence for Incremental Validity
The wide span of leader behaviours and attitudes included in RL measures and its partial overlap with ethical and servant leadership imply that incremental validity needs strengthening to establish RL's unique contribution to the leadership literature. Clearer delineations are required to define what RL is and what it is not. Comparing nomological networks of RL with the equivalent of ethical leadership and servant leadership could be one possible approach to draw the line between constructs. Further investigations of our suggested four RL core aspects could be another approach to explore and refine what RL is.
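As an illustration only, incremental validity is commonly examined with hierarchical regression: the related constructs are entered first, RL is added in a second step, and the change in explained variance is tested. The notation below is assumed for the sake of the sketch and is not taken from the reviewed studies: Y denotes an outcome variable, EL and SL denote ethical and servant leadership scores, n is the sample size, k is the number of predictors added in the second step (here k = 1) and p is the total number of predictors in the second step (here p = 3).

```latex
\begin{aligned}
\text{Step 1:}\quad & Y = \beta_0 + \beta_1\,\mathrm{EL} + \beta_2\,\mathrm{SL} + \varepsilon
  \quad\Rightarrow\quad R^2_{1} \\
\text{Step 2:}\quad & Y = \beta_0 + \beta_1\,\mathrm{EL} + \beta_2\,\mathrm{SL} + \beta_3\,\mathrm{RL} + \varepsilon
  \quad\Rightarrow\quad R^2_{2} \\
& \Delta R^2 = R^2_{2} - R^2_{1}, \qquad
  F_{(k,\; n-p-1)} = \frac{\Delta R^2 / k}{\left(1 - R^2_{2}\right) / \left(n - p - 1\right)}
\end{aligned}
```

Under these assumptions, a significant and non-trivial change in explained variance would suggest that RL accounts for variance in the outcome beyond ethical and servant leadership. The same logic could in principle be applied between different RL survey instruments.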

Explore Causal and Boundary Mechanisms
Current evidence indicates that in companies where the triple bottom line is high, both employees and leaders act responsibly. But without experimental and longitudinal designs, causality cannot be ascribed. Explorations of RL and its potential to transform irresponsible organizational behaviours could be a valuable approach.
Personal, contextual and social variables present opportunities for future research on RL antecedents. Our integrative overview points out areas of interest according to the four included levels: intrapersonal (e.g., leader self-efficacy, leader locus of control), micro-level (e.g., employee pro-environmental behaviour, employee-leader value congruence), meso-level (e.g., ethical climate, organizational culture) and macro-level (e.g., stakeholder pressure from the government, from customers, from competitors).
Following the lack of antecedents in the literature, boundary effects between antecedents and RL are likewise unexplored. Organizational factors such as organizational structure, leader autonomy and organizational culture frame leaders' decision latitude and may weaken or strengthen the enactment of RL, in keeping with upper echelon theory.
Our review implies that RL does not just include behaviour, but also, to a large extent, behavioural intentions and, to some extent, systems. The study of motivations or drives for RL could inform which aspects of RL can be facilitated and learned, and which aspects are connected to personal predispositions such as empathy and personal values.

Expand the Stakeholder Focus
Granting stakeholder theory's centrality to RL, we observed a narrow stakeholder scope in the empirical evidence. Paradoxically, a feature crucial to RL's unique contribution remains largely unexplored. Without research on external variables at the macro-level, RL's potential within this domain remains unknown. This substantial avenue for future research includes several branches of opportunity. For example, what is RL's ability to influence external stakeholders to take more responsibility? Can RL increase pro-environmental behaviour among customers, visitors, suppliers in supply chains and actors in local communities? An expansion of the scope of respondents is required to answer such questions, stepping outside of the organization to shed light on these aspects and to map social mechanisms such as trickle effects and social learning inside and outside of the organization. Issues related to specific industries or operations also represent a well of opportunities for research on RL. One specific topic could focus on RL and private-public partnerships. Additionally, considering the current challenge of climate change and business demands concerning the environmental bottom line, nature as a stakeholder should be explored.

Data Availability Statement:
The data that support the findings of this study are available from the author upon reasonable request.