Identifying Corporate Sustainability Issues by Analyzing Shareholder Resolutions: A Machine-Learning Text Analytics Approach

Abstract: Corporations have embraced the idea of corporate environmental, social, and governance (ESG) under the general framework of sustainability. Studies have measured and analyzed the impact of internal sustainability efforts on the performance of individual companies, policies, and projects. This exploratory study attempts to extract useful insight from shareholder sustainability resolutions using machine learning-based text analytics. Prior research has studied corporate sustainability disclosures from public reports. By studying shareholder resolutions, we gain insight into the shareholders’ perspectives and objectives. The primary source for this study is the Ceres sustainability shareholder resolution database, with 1737 records spanning 2009–2019. The study utilizes a combination of text analytic approaches (i.e., word cloud, co-occurrence, row-similarities, clustering, classification, etc.) to extract insights. These are novel methods of transforming textual data into useful knowledge about corporate sustainability endeavors. This study demonstrates that stakeholders, such as shareholders, can influence corporate sustainability via resolutions. The incorporation of text analytic techniques offers insight to researchers who study vast collections of unstructured bodies of text, improving the understanding of shareholder resolutions and reaching a wider audience.


Introduction
Environmental and climate change trends, as well as a general concern about the environment and environmental impact of business operations, have provoked public demand for corporate accountability and transparency [1][2][3][4]. In light of such pressure, companies are moving sustainability to the top of the corporate agenda [5][6][7][8]. The C-Suite is recognizing the need for more transparency regarding corporate sustainability practices and their impact not only on the environment, but on society as well [9,10]. To educate the public and stakeholders, sustainability information is being integrated into corporate disclosures (e.g., annual reports). In addition, business strategies are being redrawn to result in better environmental and social performance [8,11].
Corporate sustainability encompasses a range of aspects, among them: Corporate reputation, sustenance and expansion of economic growth, customer relationships, shareholder value, prestige, and the quality of products and services [12]; implementing and demonstrating social and environmental concerns in business transactions [13]; satisfying the current needs of stakeholders without compromising the capability to meet the needs of the future [14]; and demonstrating the ability to avail opportunities and manage risk in terms of economic, environmental, and social dimensions, inter-sectoral distributions. Third, to the best of our knowledge, machine learning-based text analytics methodology has not been extensively used to analyze shareholder resolutions. This technique is essential, considering the data is mostly textual. Our study fits well in this niche. Fourth, this study looks at corporate sustainability from the lens of the shareholder (rather than from the management). Text analysis offers the ability to examine the textual information without assuming a pre-defined list of terms. In analyzing shareholder sustainability resolutions, we focused on the following: Identification and evolution of sustainability issues; incorporation of environmental, economic and social aspects in the resolutions; and the variances in sustainability resolutions (and practices) between sectors [29,31]. We explore the research question: Do shareholder resolutions reflect corporate sustainability concerns in terms of environmental, social, and governance aspects?
The rest of the paper is organized as follows: Section 2 follows with a background on corporate sustainability and shareholder resolutions; Section 3 describes the methodology; Section 4 presents the analysis and results; Section 5 offers an integrated discussion; and lastly, Section 6 concludes with the scope and limitations of the current research and offers directions for future research.

Sustainability
In its 1987 report, the Brundtland Commission defined sustainability as an initiative for meeting the needs of the present, without compromising the ability of future generations to meet their own needs [61,62]. This view of sustainability has been accepted universally [31,63].
Sustainability, applied in an organizational context, is termed corporate sustainability. This term is variably defined [63]. While some research focuses on environmental aspects of sustainability, others focus on the social aspects, and yet others adopt an integrated view by combining the environmental, social, and economic aspects [63][64][65]. We envision corporate sustainability at the junction of environmental performance, economic contribution, and social responsibility [14,66] as posited by the triple bottom line perspective [22].
Environmental practices are concerned with the rate of utilization of natural resources and of emissions in the eco-system [14]. Conservation of resources and waste management are important elements [67][68][69]. Environmental performance is indicated by resource depletion, energy use, air emissions, biodiversity, noise, solid waste, transport, and water use and discharge [66].
Social practices are those that are directed at maintaining quality of life in the community [14,69]. Some of the aspects include corporate philanthropy [67], corporate citizenship, social partnership, and social sponsorship [66]. They also cover human capital elements like employee training programs, health programs, child-care, and others [68].
Economic practices include corporate governance, risk management, emergency management, compliance [67,69], economic profitability, and economic equity [70]. Organizations face pressure to be sustainable from several entities such as government [2,66,71], customers and employees [72], and management [73]. Organizations need to embed sustainability within their overall strategy in order to be successful [74,75]. One of the ways in which organizations manifest their sustainable activities is through publication of corporate sustainability reports [63]. Another window into these activities is the set of shareholder resolutions that concern sustainability [38].

Shareholder Resolutions
Recently, shareholders have increased their participation in managerial oversight through private communication with managers, proxy contests, and shareholder proposals [58,76]. Due to their low cost, shareholder proposals give parties otherwise left out of the corporate governance process a chance to voice their concerns over the management of the firm [60,76].
The process of shareholder proposal is governed by the Securities and Exchange Commission's (SEC) Shareholder Proposal Rule 14a-8, which regulates proxy voting. Rule 14a-8 gives shareholders the opportunity to submit initiatives over issues that directly affect the firm as a profit-making institution [76][77][78]. Shareholder activists can try and influence corporate behavior by trying to exercise voting rights on shareholder proposals or by entering into dialogue with companies on a particular issue [38]. They can sell shares over an issue, prepare and lodge shareholder resolutions, vote on proposals, and engage in dialogue with the management on particular issues.
Some academic research has examined the trend of shareholder activism for corporate social responsibility (CSR) and/or environmental and social issues [38,79,80]. In the United States, shareholder activism has existed for decades [38], and shareholders have long pressured companies to address social issues through the proxy voting process [81]. The movement gained popularity in 1946 with the SEC adopting rule 14a-8, requiring companies to include shareholder resolutions in their proxy statements [38] in addition to annual meetings. In the 1960s and 1970s, shareholders pressured companies about issues including employment and training, civil rights and equal opportunities, and weapons production [80]. These efforts resulted in changes to the rules for proposals to shareholder meetings. In 1970, the US courts issued a ruling against the SEC after it disallowed a proposal intending to force the Dow Chemical Company to stop producing napalm during the Vietnam War [38]. This decision paved the way for other social responsibility proposals to follow [81]. For example, Ralph Nader's landmark Campaign GM and Project on Corporate Responsibility used shareholder proxy to pressure General Motors (GM) into being more responsive to societal needs in areas like product safety, environmental pollution, and employment discrimination [82][83][84].
During the 1980s, shareholder activism in the U.S. tended to swing toward anti-takeover activities. This was a result of public focus on making companies more competitive and on adopting social and environmental responsibility [38]. For example, in the wake of the Exxon Valdez disaster, the investor and environmental activist alliance CERES was formed, bringing public focus to the need for improved environmental disclosure by companies to stakeholders [38]. The groups also explored how the management of these issues could be better integrated into the company [38]; an example is the mandatory implementation by companies of international standards like ISO 14001 and the International Labor Standards (ILO) [85]. Overall, in the late 1990s, the phenomenon of activism was characterized by a spirit of cooperation, rather than confrontation [86], directing the focus on CSR with mutual engagement and stakeholder dialogue [38,87].
Since most shareholder proposals are advisory in nature, management is not required to implement proposals, even if they receive majority support (see [88][89][90] for discussion of rules). However, while historically, corporate meetings were only a nominal event to symbolically implement executive intentions, with the increase in social activism, these meetings have taken on a new light [38,39,60]. The Institutional Shareholder Services (ISS) published that, in the face of corporate scandals and reforms, disclosure of social and environmental issues has come to the forefront of the corporate governance agenda [60,91]. This opened the door for shareholders to file resolutions and obligated management to distribute these resolutions with proxy statements. Resolutions became a more effective tool for shareholders to gain management attention [43,58,60]. Some management consulting firms like PricewaterhouseCoopers (PwC) even published manuals to help CEOs anticipate and deal with social and environmental issues at annual shareholder meetings [43,92].
The socially oriented shareholder activism movement has had success with several initiatives. For one, during the apartheid era, shareholder activism forced firms to withdraw from South Africa [93]. Similarly, McDonald's was forced to stop using polystyrene clamshell packaging materials [38]. More recently, companies were pushed towards more disclosure of greenhouse gas emissions and climate change [94]. Corporations continue to face societal pressures to improve their business citizenship practices and embrace concepts like social justice, sustainability, and the triple bottom line [77,95,96,97]. The shareholder resolutions related to improvements in the performance of a company, specifically with regard to the triple bottom line, have a marked effect on corporate management [58,97,98,99,100].
Through these, shareholders urge the management to adopt a proactive rather than a reactive strategy, while still maintaining or enhancing their reputation and competitive advantage [97,99].
In addition to resolutions, activist stakeholders can deploy tactics such as media campaigns, negotiations, lawsuits, boycotts, and lobbying legislators in their efforts to pressure companies into altering their social agendas [38,60,77,101]. There is interactive engagement of both shareholders and companies [38,77] in the area of CSR [38,58]. In the context of interaction between companies and shareholders, there is speculation from environmental and social activists on the power and influence of shareholders. Many shareholders are turning into active lobbyists [38,43,58,60]. They influence corporate behavior by enforcing ownership rights through the process of preparing or voting on shareholder proposals and engaging in direct dialogue with management [38,43,77]. Some academic research has examined the trend of shareholder activism for CSR and/or environmental and social issues [79,80].
Shareholder resolutions have attracted activists who are interested in influencing corporate behavior [79]. Ever since the decline of corporate takeover activity in the late 1980s, shareholder resolutions have been a venue for fostering changes by the board of directors as well as by top management. Such resolutions have become a viable instrument for activists to make demands on management for changes in corporate practices [79].
Changes desired by shareholder activists in the form of resolutions can range from managerial performance or governance issues to social issues [79]. Some examples of social issues addressed are human rights, refraining from military contracting, and changing executive compensation [79]. Considering the importance of shareholder resolutions in disseminating sustainability-related ideas, it is imperative to elicit key insights from this large corpus of information. In the following section we discuss the relevance of using text analytics as a machine learning technology, specifically for analyzing shareholder resolutions.

Text Analytics and Machine Learning
Text analytics as a machine learning approach has gained popularity with the increasing availability of electronic documents from a multitude of sources. Electronic data, in general, can be structured or unstructured. Structured data is well-defined, organized, and easily searchable; while unstructured data comprises complex data that are of free form (little to no structure), comprising multiple types including audio, video, and/or graphics. Examples of unstructured data sources include the World Wide Web, electronic corporate documents such as annual reports/resolutions, digital libraries, online forums, online news channels, chat rooms, electronic mail and blog repositories, among others [102]. Considering the ubiquity of unstructured data, extracting knowledge from these is critical as a research and application area.
Natural language processing, machine learning, and text analytics automatically classify and discover patterns from electronic documents. By using natural language processing, text analytics can transform unstructured data into a structured form that is suitable for analysis and for applying machine learning algorithms. By using text analytics, researchers can assess various dimensions of core concepts that they want to pursue in the unstructured data. Most text analytics studies use thesaurus-like dictionaries composed of words or phrases with shared meanings [103]. To analyze a corpus, the method assesses the frequencies of entries and categories and evaluates the relative importance of central concepts in the text. The advantage of text analytics is its capacity to analyze large volumes of data [104]. In the current research context, shareholder resolutions offer an appropriate domain. For one, they form a large corpus of unstructured text. They also contain important sentiments of shareholders and present an appropriate repository to explore. Companies communicate their commitment to various public causes including sustainability through reports and resolutions. Therefore, shareholder resolutions present an important research area worth exploring. The methodology of text analytics has been used in research on analyzing unstructured textual information such as tweets on vaccination [105], legal petitions [106], or health blogs [107].

Methodology
This exploratory study therefore examined shareholder resolutions of companies to elicit the broad sustainability practices and trends using machine learning-based text analytic approaches. The source of the resolutions is the climate and sustainability shareholder resolutions database from CERES, a sustainability, non-profit organization with numerous sustainability-related databases [108][109][110][111][112]. The machine learning-based text analytics approach enables the processing and study of a large corpus of textual data in an efficient manner [31,36,113]. The insight gained from the study alerts companies' management, stakeholders, ESG proponents, policy makers, and others as to the implications and future direction of ESG in companies.
We retrieved approximately 2100 records of the shareholder resolutions (https://www.ceres.org/resources/tools/climate-and-sustainability-shareholder-resolutions-database) from www.ceres.org using the Python Selenium package. Our primary sample included publicly traded firms from 2009 to 2019. Out of the total of 2100 records, we had to eliminate 363 records due to missing/incomplete content or non-functional links. The final data set consisted of 1737 resolutions from 545 companies (as of 10 April 2019). We present several descriptive analyses from processing the extracted text data.
In addition to the overall analysis of resolutions for sustainability practices, we pursued an additional exploratory question relating to the potential to predict the voting status of a resolution. That is, given the features from the resolutions, can voting status be predicted? In the case of public companies in the United States, a shareholder resolution is a proposal submitted by shareholders for a vote at the company's annual meeting. The CERES shareholder resolution database contains several categories of the voting status of resolutions, such as "Vote", "Withdrawn", "Omitted", "Filed", "No Vote", "Sent", and "Floor Proposal." For the purpose of our research, we aggregated the categories other than "Vote" and "Withdrawn" into "Other". Therefore, we considered three voting statuses: "Vote" to denote resolutions that were submitted for vote; "Withdrawn" to denote resolutions that were withdrawn; and "Other" to denote resolutions in all other categories.
Table 1 presents the top 10 companies with most resolutions in the database. Dominion Energy, Inc., Chevron Corporation, and Exxon Mobil Corporation contributed about 8% of total resolutions. Interestingly, Amazon.com Inc. accumulated 20 resolutions and ranked fourth in the dataset. This data is an indication that the shareholders of these companies are aware of and pay attention to climate and sustainability issues. It may also indicate that these companies are more transparent when disclosing their shareholder resolutions regarding climate and sustainability to the public.
[...] and (5) Food Products. These five industries account for 686 out of the 1737 resolutions. According to Figure 1, the Oil, Gas & Consumable Fuels sector contributed to nearly 30% of the total resolutions. The top three sectors contributed to over 50% of the total resolutions. Figure 2 confirms that the Oil, Gas & Consumable Fuels sector filed the greatest number of resolutions during this timeframe. This sector peaked in 2016 as a result of increased awareness and political and social activism surrounding climate change during this period. Overall, resolutions started to decline after 2017. The stacked chart in Figure 3 shows the dominance of the Oil, Gas & Consumable Fuels sector over the others in terms of the volume of resolutions filed. This is followed by the Electric Utilities sector.
Table 3 displays the distribution of voting status over the study period. The voting status was extracted from each resolution. As mentioned, the voting statuses were coded as "Vote", "Withdrawn", or "Other." Most resolutions were either put to a vote or withdrawn from consideration. However, the database is limited in that we do not know the nature of the vote (whether it passed). The "Other" status comprises the miscellaneous statuses like "Omitted", "Filed", "No Vote", "Sent", and "Floor Proposal".
Figure 4 shows the distribution of voted resolutions by year. The trend fluctuated from 2009 to 2013. However, there is a sharp upward trend from 2014 to 2018. Its dramatic rise indicates that shareholders escalated the use of resolutions as a lever to alert management to corporate climate change, ESG, and sustainability initiatives and practices. During this period of activism, resolutions tended to get a vote. The drop in the trend line for 2019 reflects that data were available only through April 2019.

Text Analytics
Next, we apply machine learning-based text analytics to study the resolutions. Data pre-processing was undertaken prior to applying the machine learning-based text analytics. The retrieved data for each resolution was saved as a text file prior to vectorization. A few resolutions had to be removed for lack of clarity/relevance and/or non-functional links.
Predictive models developed with text data pose several challenges to the modeling process. First, textual data cannot be used directly as inputs in many mathematical models. Therefore, a natural language processing (NLP) system was implemented to transform the textual content into integral elements for further analysis. Second, text-based data sets are larger in size as compared to numerical data sets. Therefore, a successful model necessitates information extraction by identifying the most relevant pieces of the data. In the data pre-processing stage, the scraped, raw summary section was transformed into plain text documents through the elimination of punctuation, spaces, numbers, and standard stop words using the Natural Language Toolkit (NLTK). Next, the text was converted to lower case using NLTK and TextBlob (https://textblob.readthedocs.io/en/dev/). Following this, the lemmatization technique was applied to convert the words into their root form (for example, "bought" and "buying" were substituted with "buy"). Lemmatization works by grouping together the different inflected forms of a word, which enables them to be analyzed as a single term. In this way, it brings context to the words. The pandas package (https://pandas.pydata.org/) was applied to filter and manipulate scraped data into data frames for analysis. The sklearn package (https://scikit-learn.org/stable/index.html) was used to refine the results. The maximum number of features was set to approximately 4000. Tokenized words included those with more than four characters. These steps reduced the size of the feature set passed to the downstream model.
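The pre-processing steps described above can be illustrated with a minimal sketch. It uses only the Python standard library so that it runs without NLTK corpora; the small `STOP_WORDS` set and the `LEMMAS` lookup are hypothetical stand-ins for NLTK's stop-word list and a real lemmatizer, and applying the length filter before lemmatization is an assumption, since the text does not state the order.

```python
import re

# Tiny illustrative stop-word list (assumption); the study used NLTK's standard list.
STOP_WORDS = {"the", "and", "that", "this", "with", "their", "which", "should"}

# Toy lemma lookup standing in for a real lemmatizer (assumption).
LEMMAS = {"bought": "buy", "buying": "buy", "emissions": "emission", "reports": "report"}

MIN_LENGTH = 5  # keep tokens with more than four characters


def preprocess(text):
    """Lowercase, strip punctuation and digits, drop stop words and short tokens, lemmatize."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)  # punctuation and numbers become spaces
    tokens = text.split()                   # splitting also collapses extra whitespace
    tokens = [tok for tok in tokens if tok not in STOP_WORDS and len(tok) >= MIN_LENGTH]
    return [LEMMAS.get(tok, tok) for tok in tokens]


sample = "Shareholders request that the Board reports on methane emissions by 2020."
print(preprocess(sample))
```

In a fuller pipeline the resulting token lists would be passed to a vectorizer with a capped vocabulary, in the spirit of the roughly 4000-feature limit mentioned above.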
The 'term frequency-inverse document frequency' (TF-IDF) technique was used to compute the weight of each term to signify its importance in a document. This is an information retrieval technique that assigns a weight based on a term's frequency (TF) and its inverse document frequency (IDF). Each term is given these two scores. The weight of the term is then calculated as a product of these scores. A more detailed explanation of the technique is given in the Results section. The KMeans clustering model was applied to elicit the key sustainability concepts. KMeans clustering is one of the popular machine learning algorithms deployed to group cases based on similarity measures (that is, the distance between cases). It is typically used in pattern recognition and classification domains. Seven clusters were identified using a word cloud package. This was followed by classification with the KNN classifier (a supervised machine learning algorithmic method), wherein the data was split into two parts (training and testing) to measure the success of the classifier.
Table 4 summarizes the key variables extracted from the corpus of text data. CERES tracks shareholder disclosures filed by the investor network participants pertaining to enterprises' sustainability issues, including energy consumption, climate change, water scarcity, and sustainability initiatives reporting. These disclosures are part of broader investor efforts to address the full range of ESG issues. The disclosures are filed by many of the largest public pension funds, foundations, and religious, labor, and socially responsible investors in the U.S. Many of the investors are members of CERES' Investor Network on Climate Risk and Sustainability.
A resolution decision variable (label) is used as the response variable when building a prediction model. The "status" variable was transformed into a numerical integer to suit the machine learning algorithm. The transformed labels for "status" are Vote = 1, Withdrawn = 0, and Other = 2. This was labeled as the voting status indicator. A critical factor in this study is that the textual data source contributes to the forecasting of the voting status of the shareholder resolution. The shareholder can examine the initiatives that may be discussed, as well as open it for a vote at the corporate level. The study's predictive model focused on the main text body of disclosures. It contains a narrative explanation of environmental issues, relative impacts, and future directions related to coping with such problems. The goal is to predict the voting status of a given disclosure to learn stakeholders' sentiments toward specific ESG issues.
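The label transformation described above can be sketched in a few lines. The integer codes follow the text; the function name `encode_status` and the dictionary layout are illustrative choices, not taken from the study.

```python
# Integer codes as stated in the text: Vote = 1, Withdrawn = 0, Other = 2.
STATUS_CODES = {"Vote": 1, "Withdrawn": 0, "Other": 2}


def encode_status(raw_status):
    """Collapse a raw database status into Vote/Withdrawn/Other and return its integer code."""
    label = raw_status if raw_status in ("Vote", "Withdrawn") else "Other"
    return STATUS_CODES[label]


raw = ["Vote", "Withdrawn", "Omitted", "Filed", "No Vote", "Sent", "Floor Proposal"]
print([encode_status(status) for status in raw])
```

Every status other than "Vote" and "Withdrawn" falls through to the "Other" code, which mirrors the aggregation described in the Methodology section.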

Results and Analysis
This section presents the results of the machine learning-based text analytics performed on the resolutions data set. To surface overlapping key phrases, we validated the results by mapping and comparing the critical topics discovered in the word cloud analysis against the co-occurrence analysis. Subsequently, the cluster analysis was conducted on the main themes and associated topics to identify core concepts and essential factors within each cluster. Lastly, prediction of the vote decision was attempted.
The word cloud map in Figure 5 displays the words most frequently found in the corpus of resolutions. This was generated with wordclouds.com. The larger the size of a word, the more frequently it occurs in the corpus. This affirms what we know via exploratory manual reading and analysis of the key implicit issues in the resolutions. The word cloud shows that the top issues of shareholders as found in the resolutions are lobbying, disclosure, climate, risk, emissions, and sustainability (indicated in the center of the map). Therefore, a key takeaway is that shareholders in companies understand the significance and impact of sustainability and environmental issues like gas emissions. In fact, they want top management to pay attention to these matters.
Figure 6 presents the top 35 words in the shareholder resolutions' reports. The WordCount function gives a frequency count of how often a word occurs. The most frequent words include report, shareholder, lobbying, climate, and risk. This confirms the high-level sustainability issues that are displayed in the word cloud in Figure 5. However, keywords by themselves do not contribute much to a deeper understanding of shareholders' interests and motivations regarding sustainability.
Therefore, we next examined the co-occurrence of words. In linguistics, co-occurrence represents the likelihood of two terms occurring alongside each other, in a certain order, within a large corpus of data. In this sense, it is used as an indicator of semantic closeness of terms [114]. This model gives insight as to which issues are related. Figure 7 displays the top 30 bigrams, showing how frequently two words occur together in the data set. Interesting patterns emerge from this analysis. For instance, "climate" and "change" are found together 2300 times. This indicates that shareholder resolutions are centered around climate change.
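The word-frequency and bigram counts described above can be reproduced with a short sketch. The three tokenized documents are invented for illustration and are not drawn from the CERES database; the helper names `top_terms` and `top_bigrams` are likewise hypothetical.

```python
from collections import Counter


def top_terms(docs, n):
    """Count single-word frequencies across tokenized documents."""
    counts = Counter(tok for doc in docs for tok in doc)
    return counts.most_common(n)


def top_bigrams(docs, n):
    """Count adjacent word pairs (bigrams) within each tokenized document."""
    counts = Counter()
    for doc in docs:
        counts.update(zip(doc, doc[1:]))  # pairs never span two documents
    return counts.most_common(n)


# Hypothetical pre-processed resolutions.
docs = [
    ["report", "climate", "change", "risk"],
    ["lobbying", "disclosure", "climate", "change"],
    ["climate", "change", "greenhouse", "emission"],
]
print(top_terms(docs, 2))
print(top_bigrams(docs, 1))
```

Counting pairs document by document keeps the last word of one resolution from being paired with the first word of the next.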
Similarly, the combination of the top five bigrams shows that stakeholders are attempting to make positive changes on issues like climate change and greenhouse gas through both direct and indirect lobbying.
Next, we generated row similarities using the term frequency-inverse document frequency (TF-IDF) technique. This algorithm searches through a corpus of documents and determines which words are favorable to use in a query. For each word in a document, a value is calculated from the frequency of the word in that document, offset by the proportion of documents in the corpus in which the word occurs [115]. A high TF-IDF value indicates a strong relationship with the document in which the word occurs. This implies that if the word were used in a query, then the document would be relevant and of interest to the user [115]. This technique ensures efficient query retrieval with relevant words. The row similarity analysis indicates the parallels between resolutions. Cosine-similarity, one measure of row similarity, calculates a numeric quantity to denote the similarity between groups of text [115]. Different cosine-similarity values correspond to different angles, and hence different orientations, between the vector representations of two documents. Here, we define a function that takes a resolution title as input. It then outputs a list of the 10 most similar resolutions by implementing the concept of cosine-similarity on the resolutions. Figure 9 shows the approach to identifying the top resolutions that are most relevant. For example, keywords and labels of similar resolutions are generated when a resolution pertaining to renewable energy is input to the model. This offers both shareholders and top management insight into a resolution's likely voting outcome based on other resolutions.
We next explored the potential to identify and organize the key climate change and sustainability issues shareholders are most concerned with by applying the techniques of clustering and classification in an exploratory way.
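The row-similarity lookup described in this passage can be sketched as follows. This is a minimal standard-library illustration under stated assumptions: the weight is the textbook product tf × log(N/df), whereas libraries such as scikit-learn apply a smoothed variant, and the titles, token lists, and function names are invented for the example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per tokenized document, with weight = tf * idf."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(title, titles, vectors, top_n=10):
    """Rank all other resolutions by cosine similarity to the one with the given title."""
    query = vectors[titles.index(title)]
    scored = [(cosine(query, vec), other)
              for other, vec in zip(titles, vectors) if other != title]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [other for _, other in scored[:top_n]]


# Hypothetical resolution titles and their pre-processed text.
titles = ["Renewable energy targets", "Methane emission report",
          "Lobbying disclosure", "Renewable energy sourcing"]
docs = [
    ["renewable", "energy", "target", "climate"],
    ["methane", "emission", "report", "climate"],
    ["lobbying", "disclosure", "expenditure", "report"],
    ["renewable", "energy", "sourcing", "emission"],
]
vectors = tfidf_vectors(docs)
print(most_similar("Renewable energy targets", titles, vectors, top_n=2))
```

Storing each document as a sparse dictionary avoids building a dense matrix, which is convenient for a toy corpus; a vectorizer-based matrix would be the more usual choice at the scale of the full data set.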
Clustering helps identify the keywords that are central to the resolution; classification can help classify new resolutions into these keywords. The K-means algorithm was applied to generate the clusters. Figure 10 shows the number of keywords in each cluster. Cluster 1, as shown in Figure 11A, has 449 sustainability resolutions with terms like carbon, climate change, emission, energy, methane, renewable, etc. This indicates that the resolutions in this cluster are associated with emission and energy. Therefore, one could tentatively label this cluster as "emission and energy." This cluster is tentatively termed "goal." Activism has primarily been driven by an approach where investors choose one topic, such as climate change or diversity [58].
The second cluster (cluster 2), shown in Figure 11B, has 292 resolutions. This cluster identifies key concerns of shareholders like diversity (gender, race, etc.), board, committee, compensation, executive pay, voting, etc. Therefore, this is grouped under the general term of "board." The issues in cluster 3 (Figure 11C), with 193 resolutions, are primarily communication, legislation, lobbying, oversight, etc. These tend to reflect "regulation". Cluster 4 (Figure 11D), with 106 resolutions, emphasizes the communication of sustainability ideas via lobbying with the government, campaigning, disclosure, expenditure, transparency, etc. This results in the possible cluster label of "politics".
Cluster 5, as displayed in Figure 11E, has 42 resolutions. It refers to issues like by-law, ESG, governance, metric, managing, proxy, etc. Overall, this cluster appears to describe "governance". The governance issues in our sample comprise primarily political lobbying, corruption, and board oversight regarding environmental and social issues [58]. Therefore, this cluster is related to cluster 2, namely, "Board".
Next, cluster 6 (Figure 11F) mentions a variety of products and services at the core of climate change and sustainability. These include deforestation, drugs, food, packaging, plastic, recycling, etc. The products, and their related activities, are essential to human life. However, the process of making them can be improved through sustainability practices like proper recycling and biodegradable packaging. Therefore, one may assign it the label "product management".
The final cluster, cluster 7 (Figure 11G), has 218 resolutions. It mentions a variety of core concepts of climate change, ESG, and sustainability practices within companies. Its keywords include guidelines, environmental, reporting, responsibility, sustainability, etc. Overall, this cluster can be labeled "accountability".
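The clustering and labeling procedure described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's code: the text does not specify the feature representation or the initialization, so length-normalized term counts and fixed seed documents are assumed here, three clusters are used instead of seven, and the toy summaries and helper names are hypothetical.

```python
import math
from collections import Counter


def vectorize(docs):
    """Turn each document into a length-normalized term-count vector."""
    vocab = sorted({term for d in docs for term in d.lower().split()})
    index = {term: i for i, term in enumerate(vocab)}
    vectors = []
    for d in docs:
        vec = [0.0] * len(vocab)
        for term, count in Counter(d.lower().split()).items():
            vec[index[term]] = float(count)
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, seeds, max_iter=100):
    """Lloyd's algorithm; `seeds` are indices of the initial centroids."""
    centroids = [list(points[i]) for i in seeds]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(centroid, vocab, n=3):
    """Terms with the largest centroid weights, used to suggest a cluster label."""
    ranked = sorted(zip(vocab, centroid), key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in ranked[:n]]


# Hypothetical toy summaries standing in for the Ceres records.
docs = [
    "carbon emission energy methane",
    "renewable energy emission carbon",
    "board diversity executive pay",
    "board committee executive compensation",
    "plastic packaging recycling",
    "recycling packaging deforestation",
]
vocab, points = vectorize(docs)
labels, centroids = kmeans(points, seeds=[0, 2, 4])  # k = 3 for the toy data
for c, centroid in enumerate(centroids):
    size = labels.count(c)
    print(f"cluster {c + 1}: {size} resolutions, top terms {top_terms(centroid, vocab)}")
```

The highest-weighted terms of each centroid play the role of the keywords shown in Figure 11: the analyst reads them and proposes a label, which is why the labeling step remains a matter of judgment.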
Cluster analysis focuses on data position and the visual structure of the data within a cluster. This allows us to pinpoint important nodes (in this case, keywords) and key findings. Next, we turn our attention to classification, which allows us to engage in predictive analytics. Given a resolution, stakeholders may ask whether its voting status can be classified accurately.
In this exploratory research, the K-nearest neighbor algorithm is used to build the classification model. Figure 12 plots model accuracy (y-axis) against the number of neighbors (x-axis). Testing accuracy peaks at 61% with five neighbors, indicating that the model can predict the voting status of a given resolution with 61% accuracy. With further iteration and refinement of the model, this accuracy may be improved.
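The classification step can be illustrated with a small K-nearest neighbor sketch. It is not the model reported above: the features and distance measure used in the study are not stated, so a Jaccard distance on word sets is assumed here for brevity, and the resolution texts and the status labels "vote" and "withdrawn" are hypothetical toy data.

```python
from collections import Counter


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| on the sets of words in two texts."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return 1.0 - len(sa & sb) / len(union) if union else 0.0


def knn_predict(train, text, k):
    """Majority vote among the k training examples closest to `text`."""
    nearest = sorted(train, key=lambda ex: jaccard_distance(ex[0], text))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def accuracy(train, test, k):
    """Share of test examples whose predicted status matches the true one."""
    hits = sum(knn_predict(train, text, k) == label for text, label in test)
    return hits / len(test)


# Hypothetical (text, voting status) pairs; not taken from the Ceres database.
train = [
    ("report on methane emissions", "vote"),
    ("report on carbon emissions targets", "vote"),
    ("report on climate lobbying", "vote"),
    ("adopt board diversity policy", "withdrawn"),
    ("adopt renewable energy policy", "withdrawn"),
    ("adopt recycling policy", "withdrawn"),
]
test = [
    ("report on emissions reduction", "vote"),
    ("adopt packaging policy", "withdrawn"),
]

# Accuracy as a function of the number of neighbors, as in Figure 12.
for k in (1, 3, 5):
    print(f"k={k}: test accuracy {accuracy(train, test, k):.2f}")
```

Sweeping the number of neighbors and recording the test accuracy for each value is the kind of comparison that Figure 12 summarizes; on real data the accuracy would typically vary with k rather than stay constant as it does in this toy example.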

Discussion
Increasingly, activist shareholders are attempting to pass resolutions on sustainability to gain the attention of management. By proposing resolutions and putting them to a vote, shareholders are bringing attention to climate change, ESG, and sustainability causes. Therefore, shareholder resolutions are a key source of information on the shareholders' sentiments regarding sustainability. By extension, management and other stakeholders can use the resolutions to gauge those sentiments.
This exploratory study used several machine learning-based methods in text analytics to analyze and elicit key information from 1737 shareholder resolutions over an 11-year period. The research capitalizes on advances in information processing technology to extract insight from large corpora of text, a process that was previously left to subjective evaluation. In the process, and in line with our research question, we identified seven key clusters (topics) related to sustainability, representing shareholders' main concerns. Our analysis shows that shareholders are concerned about topics related to emission and energy, boards, regulations, politics, governance, product management, and accountability. The identification of these key topics is consistent with results from other studies related to ESG and sustainability reports, typically from a managerial perspective. It therefore appears that both shareholders and management are interested in similar themes.
Analyzing shareholder resolutions allows a window into the landscape of issues that shareholders are passionate about. Analyzing shareholder resolutions also helps companies foresee the evolution of these issues in the public eye. In addition, it helps companies tell whether they are likely to become targets of public scrutiny related to the issues [79].

Conclusions
This exploratory study attempts to gain insight into the climate change and sustainability concerns of shareholders by examining shareholder resolutions. However, there are many other issues warranting investigation. For example, to what extent can text documents such as the resolutions be used to forecast sustainability trends and managerial action? Trend analysis can be performed to examine attitudes and impact over time. The impact of resolutions on the public in general, and vice versa, can be examined from a social media perspective (public sentiment analysis). Reproducibility and validation are key challenges in the application of machine learning and text analytics. Additionally, the intuitive labeling of clusters makes the interpretation of the machine learning output subjective. Yet, we are confident that the results of the analysis, for example, the labeling of the clusters, generally represent what is described in the resolutions, as confirmed by prior literature. Thus, the approach provides a powerful methodology to study and understand sustainability. Additional limitations include the reliability of the documents and the validity of the data preparation techniques. Because the study examines a small sample of resolutions, the generalizability of the topics (clusters) is uncertain. They may not characterize overall corporate efforts and initiatives regarding sustainability. Further, machine learning models are only capable of extracting a limited amount of insight. Additionally, this study used the summary section as the sole source of unstructured data. Future research should investigate the value of certain sections of a company's 10-K reports and user-generated content (e.g., tweets, blog postings, etc.).
Despite the limitations, our study contributes to policy and research in three ways. First, practitioners and researchers can utilize the results to prioritize sustainability initiatives and examine some of the clusters via different lenses like those of non-governmental organizations (NGOs) or other activists. Second, the study demonstrates the efficacy of machine learning and text analytics, through descriptive and predictive analytics, in gaining insight into shareholder resolutions and supporting informed decisions. Third, we demonstrate the evolutionary trend of how shareholder resolutions can be an effective venue for fostering changes in corporate practices by corporate management.
Shareholders, being a large and prominent group of stakeholders, can effectively project a positive image of the company with transparency. In fact, regardless of whether a resolution is passed or implemented, the resolution being publicly available creates a positive impact on the dialogue regarding corporate sustainability. Future research can continue to explore and apply advanced techniques like deep learning to delve into shareholder resolutions and disclosure reports for additional content analysis. For example, prescriptive analytics can be investigated to not only predict outcomes, but also suggest likely impacts and potential strategies.
Other avenues for future research include conducting cross-industry comparisons, studying global differences, and examining resolutions that affect corporate sustainability and their correlation with company performance. Sophisticated application of advanced techniques, such as artificial intelligence and deep learning, will accelerate the maturing process of gaining insight from shareholder resolutions on climate change and sustainability. Even though most shareholder proposals in the domain of ESG obtain less than 20% of votes in support [58], these proposals nevertheless serve as significant catalysts of action within companies [116] and persuade management to take effective action. Additionally, applying discovery analytics to the resolutions may shed light on innovation and new product ideas.