Environmental and climate change trends, as well as a general concern about the environment and environmental impact of business operations, have provoked public demand for corporate accountability and transparency [1]. In light of such pressure, companies are moving sustainability to the top of the corporate agenda [5]. The C-Suite is recognizing the need for more transparency regarding corporate sustainability practices and their impact not only on the environment, but on society as well [9]. To educate the public and stakeholders, sustainability information is being integrated into corporate disclosures (e.g., annual reports). In addition, business strategies are being redrawn to result in better environmental and social performance [8].
Corporate sustainability encompasses a range of aspects, among them: Corporate reputation, sustenance and expansion of economic growth, customer relationships, shareholder value, prestige, and the quality of products and services [12]; implementing and demonstrating social and environmental concerns in business transactions [13]; satisfying the current needs of stakeholders without compromising the capability to meet the needs of the future [14]; and demonstrating the ability to avail opportunities and manage risk in terms of economic, environmental, and social dimensions, so as to create long-term shareholder value [15]. Despite the multiplicity of definitions, the common thread is an underlying corporate commitment to sustainable economic development and improved quality of life for employees, the community, and society [16].
Companies demonstrate commitment to sustainability by adopting and communicating practices to stakeholders. This practice is referred to as sustainability disclosure or sustainability reporting [3]. While financial reporting covers the financial aspects of a company’s performance, sustainability reporting covers the non-financial aspects [20]. The triple bottom line perspective [22] emphasizes three facets in sustainability reporting: Economic, environmental, and social growth [23]. This view suggests that companies have a responsibility to not only make financial profits and maintain economic growth, but also consider their environmental and societal impacts [25]. More and more companies have started publicly disclosing their sustainability practices [28].
Companies often communicate sustainability information along with financial information in their annual sustainability reports [29]. These reports contain information on the environmental, economic, and social impacts from an organization’s day-to-day activities [30]. Research has concentrated on how companies address the challenge of sustainability through disclosure in sustainability reporting. Some studies focus on the frequency of reporting [28], while others deploy qualitative content analysis [27] or quantitative text analysis [27] of reporting practices. Most of these studies have only considered a limited number of reports. In contrast, [31] employed text mining techniques to conduct topic modeling on 9514 sustainability reports published between 1999 and 2015. The study applied Latent Dirichlet allocation (LDA) to identify themes within large collections of documents [37].
In recent years, a trend has emerged in which shareholder activists form networks to empower shareholders and magnify their voices [38] to demand corporate responsibility, particularly environmental responsibility [25]. Activists continue to criticize large corporations (e.g., BP, Chevron, Exxon, Shell, Monsanto) for having a negative impact on the environment [40]. Given the pressure to improve corporate environmental performance [41], combined with the heavy price for environmental corporate misdeeds [42], it is no surprise that environmental issues are one of the most cited reasons for shareholder resolutions to be presented and voted on through the proxy vote at annual meetings [43]. Additionally, with the heightened public awareness of climate change, companies are also realizing the significance of sustainability on local economies and cultures [13]. Investors and stakeholders are increasingly interested in the effects of environmental, social, and governance (ESG) issues on the sustainability of business models as they focus on their ability to generate long-term value [46]. Therefore, it is imperative that a socially responsible company integrate both the interest of stakeholders and the well-being of the environment [50]. These efforts exploit growth opportunities while giving back to the community with environmental improvements [54]. A consequence of increased stakeholder involvement is increased shareholder activism. This is reflected through shareholder resolutions on sustainability issues, with the number of resolutions gradually increasing [58]. Shareholder resolutions offer a unique lens to study corporate sustainability practices and shed light on the most important issues [43].
Our motivation for the current research arises from certain premises: First, we acknowledge that there is public demand for increased corporate commitment to sustainability; second, in response to the increased public demand, companies are eager to be involved in environmental, social, and governance activities; and third, companies communicate such involvement through avenues such as shareholder resolutions and/or annual reports. Our exploratory research, therefore, deploys a machine-learning-based text analytics approach to identify and analyze the key sustainability issues in the resolutions proposed by shareholders.
We contribute to current research on sustainability reporting in several ways. First, we use current data on shareholder resolutions published as recently as 2019 and look at an extended timeframe to include multiple years. Second, we analyze a large number of shareholder resolutions. By extending the timeframe and number of resolutions, we extend the coverage to many sectors and many years. This also enables us to highlight the evolution of topics over a period of time, as well as explore inter-sectoral distributions. Third, to the best of our knowledge, machine learning-based text analytics methodology has not been extensively used to analyze shareholder resolutions. This technique is essential, considering the data is mostly textual. Our study fits well in this niche. Fourth, this study looks at corporate sustainability through the lens of the shareholder (rather than that of management). Text analysis offers the ability to examine the textual information without assuming a pre-defined list of terms. In analyzing shareholder sustainability resolutions, we focused on the following: Identification and evolution of sustainability issues; incorporation of environmental, economic and social aspects in the resolutions; and the variances in sustainability resolutions (and practices) between sectors [29]. We explore the research question:
Do shareholder resolutions reflect corporate sustainability concerns in terms of environmental, social, and governance aspects?
The rest of the paper is organized as follows: Section 2 follows with a background on corporate sustainability and shareholder resolutions; Section 3 describes the methodology; Section 4 presents the analysis and results; Section 5 offers an integrated discussion; and lastly, Section 6 concludes with the scope and limitations of the current research and offers directions for future research.
This exploratory study therefore examined shareholder resolutions of companies to elicit the broad sustainability practices and trends using machine learning-based text analytic approaches. The source of the resolutions is the climate and sustainability shareholder resolutions database from CERES, a non-profit sustainability organization with numerous sustainability-related databases [108]. The machine learning-based text analytics approach enables the processing and study of a large corpus of textual data in an efficient manner [31]. The insight gained from the study alerts companies’ management, stakeholders, ESG proponents, policy makers, and others as to the implications and future direction of ESG in companies.
We retrieved approximately 2100 records of the shareholder resolutions (https://www.ceres.org/resources/tools/climate-and-sustainability-shareholder-resolutions-database) using a Python selenium package from www.ceres.org. Our primary sample included publicly traded firms from 2009 to 2019. Out of the total of 2100 records, we had to eliminate 363 records due to missing/incomplete content or non-functional links. The final data set consisted of 1737 resolutions from 545 companies (as of 10 April 2019). We present several descriptive analyses from processing the extracted text data.
In addition to the overall analysis of resolutions for sustainability practices, we pursued an additional exploratory question relating to the potential to predict the voting status of a resolution. That is, given the features from the resolutions, can voting status be predicted? In the case of public companies in the United States, a shareholder resolution is a proposal submitted by shareholders for a vote at the company’s annual meeting. The CERES shareholder resolution database contains several categories of the voting status of resolutions—such as “Vote”, “Withdrawn”, “Omitted”, “Filed”, “No Vote”, “Sent”, and “Floor Proposal.” For the purpose of our research, we aggregated the categories other than “Vote” and “Withdrawn” into “Other”. Therefore, we considered the three voting statuses of: “Vote” to denote resolutions that were submitted for vote; “Withdrawn” to denote resolutions that were withdrawn; and “Other” to denote resolutions in all other categories.
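The aggregation of status categories described above can be sketched in a few lines. This is a minimal illustration, not the study's code: the function name `aggregate_status` and the sample values are hypothetical, and only the category names come from the text.

```python
from collections import Counter

# Statuses kept as-is; every other database category is folded into "Other".
KEPT_STATUSES = {"Vote", "Withdrawn"}


def aggregate_status(raw_status: str) -> str:
    """Map a raw database status onto Vote / Withdrawn / Other."""
    status = raw_status.strip()
    return status if status in KEPT_STATUSES else "Other"


# Hypothetical raw values, one per resolution record.
raw = ["Vote", "Withdrawn", "Omitted", "Filed", "No Vote", "Sent", "Floor Proposal", "Vote"]
aggregated = [aggregate_status(s) for s in raw]
counts = Counter(aggregated)
print(counts)
```

Keeping the two retained categories in one set means that any status added to the database later falls into “Other” by default rather than raising an error.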
3.1. Descriptive Data Analysis
presents the top 10 companies with most resolutions in the database. Dominion Energy, Inc., Chevron Corporation, and Exxon Mobil Corporation contributed about 8% of total resolutions. Interestingly, Amazon.com Inc. accumulated 20 resolutions and ranked fourth in the dataset. This data is an indication that the shareholders of these companies are aware of and pay attention to climate and sustainability issues. It may also indicate that these companies are more transparent when disclosing their shareholder resolutions regarding climate and sustainability to the public.
presents the 64 industries that filed shareholder resolutions. The top five sectors are: (1) Oil, Gas & Consumable Fuels; (2) Electric Utilities; (3) Hotels, Restaurants & Leisure; (4) Multi-Utilities; and (5) Food Products. These five industries account for 686 out of the 1737 resolutions. According to Figure 1, the Oil, Gas & Consumable Fuels sector contributed to nearly 30% of the total resolutions. The top three sectors contributed to over 50% of the total resolutions.
confirms that the Oil, Gas & Consumable Fuels sector filed the greatest number of resolutions during this timeframe. This sector peaked in 2016 as a result of increased awareness and political and social activism surrounding climate change during this period. Overall, resolutions started to decline after 2017. The stacked chart in Figure 3 shows the dominance of the Oil, Gas & Consumable Fuels sector over the others in terms of the volume of resolutions filed. This is followed by the Electric Utilities sector.
displays the distribution of voting status over the study period. The voting status was extracted from each resolution. As mentioned, the voting statuses were coded as “Vote”, “Withdrawn”, or “Other.” Most resolutions were either voted upon or withdrawn from consideration. However, the database is limited in that we do not know the outcome of the vote (whether the resolution passed). The “Other” status comprises the miscellaneous statuses like “Omitted”, “Filed”, “No Vote”, “Sent”, and “Floor Proposal”.
shows the distribution of voted resolutions by year. The trend fluctuated from 2009 to 2013. However, there is a sharp upward trend from 2014 to 2018. Its dramatic rise indicates that shareholders escalated the use of resolutions as a lever to alert management to corporate climate change, ESG, and sustainability initiatives and practices. During this period of activism, resolutions tended to get a vote. As of 2019, the trend line drops due to limited data availability through April 2019.
3.2. Text Analytics
Next, we apply machine learning-based text analytics to study the resolutions. Data pre-processing was undertaken prior to applying the machine learning-based text analytics. The retrieved data for each resolution was saved as a text file prior to vectorization. A few resolutions had to be removed for lack of clarity/relevance and/or non-functional links.
Predictive models developed with text data pose several challenges to the modeling process. First, textual data cannot be used directly as inputs in many mathematical models. Therefore, a natural language processing (NLP) system was implemented to transform the textual content into integral elements for further analysis. Second, text-based data sets are typically larger than numerical data sets. Therefore, a successful model necessitates information extraction by identifying the most relevant pieces of the data. In the data pre-processing stage, the scraped, raw summary section was transformed into plain text documents through the elimination of punctuation, spaces, numbers, and standard stop words using the Natural Language Toolkit (NLTK). Next, the text was converted to lower case using NLTK and TextBlob (https://textblob.readthedocs.io/en/dev/).
Following this, the lemmatization technique was applied to convert the words into their root form (for example, “bought” and “buying” were substituted with “buy”). Lemmatization works by grouping together the different inflected forms of a word, which enables them to be analyzed as a single term. In this way, it brings context to the words. The pandas package (https://pandas.pydata.org/) was applied to filter and manipulate scraped data into data frames for analysis. The sklearn package (https://scikit-learn.org/stable/index.html) was used to refine the results. The maximum number of features was set to approximately 4000. Tokenized words included those with more than four characters. These steps reduced the size of the input passed downstream to the generated model.
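The pre-processing steps above can be approximated with the standard library alone. The sketch below is a stand-in for the NLTK and TextBlob pipeline, not a reproduction of it: the stop-word set and the lemma lookup table are tiny hypothetical samples, and applying the length filter after lemmatization is an assumption about the order of operations.

```python
import re
from typing import List

# Tiny illustrative stop-word list; the study used NLTK's standard list.
STOP_WORDS = {"the", "and", "that", "this", "with", "from", "their", "which"}

# Toy lemma lookup standing in for a real lemmatizer (hypothetical entries).
LEMMAS = {"bought": "buy", "buying": "buy", "emissions": "emission", "reports": "report"}


def preprocess(text: str, min_len: int = 5) -> List[str]:
    """Lower-case, strip punctuation and digits, drop stop words, lemmatize, filter by length."""
    text = text.lower()
    text = re.sub(r"[^a-z\s]", " ", text)       # remove punctuation and numbers
    tokens = text.split()                        # also collapses repeated whitespace
    tokens = [t for t in tokens if t not in STOP_WORDS]
    tokens = [LEMMAS.get(t, t) for t in tokens]  # map inflected forms to a root form
    return [t for t in tokens if len(t) >= min_len]  # keep words with more than four characters


sample = "Shareholders request that the company reports its methane emissions by 2025."
tokens = preprocess(sample)
print(tokens)
```

A real lemmatizer handles far more inflections than a lookup table can; the table here only makes the effect of the step visible.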
The ‘term frequency-inverse document frequency’ (TF-IDF) technique was used to compute the weight of each term to signify its importance in a document. This is an information retrieval technique that assigns a weight to a term’s frequency (TF) and its inverse document frequency (IDF). Each term is given these two scores. The weight of the term is then calculated as a product of these scores. A more detailed explanation of the technique is given in the Results section. The KMeans clustering model was applied to elicit the key sustainability concepts. KMeans clustering is one of the popular machine learning algorithms deployed to classify cases based on similarity measures (that is, the distance between cases). It is typically used in pattern recognition and classification domains. Seven clusters were identified using a word cloud package. This was followed by classification with the KNN classifier (a supervised machine learning algorithmic method), wherein the data was split into two parts to measure the success of the classifier (training and testing). Table 4 summarizes the key variables extracted from the corpus of text data.
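The product of the two scores is commonly written as follows. The text does not state which TF and IDF variants were used, so the raw term count and the logarithmic IDF shown here are assumptions; library implementations often add smoothing and normalization.

```latex
w_{t,d} = \mathrm{tf}_{t,d} \times \mathrm{idf}_{t},
\qquad
\mathrm{idf}_{t} = \log \frac{N}{\mathrm{df}_{t}}
```

Here $\mathrm{tf}_{t,d}$ is the number of times term $t$ occurs in document $d$, $N$ is the number of documents in the corpus, and $\mathrm{df}_{t}$ is the number of documents containing $t$. A term that appears in every document receives a weight of zero under this variant.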
CERES tracks shareholder disclosures filed by the investor network participants pertaining to enterprises’ sustainability issues, including energy consumption, climate change, water scarcity, and sustainability initiatives reporting. These disclosures are part of broader investor efforts to address the full range of ESG issues. The disclosures are filed by many of the largest public pension funds, foundations, and religious, labor, and socially responsible investors in the U.S. Many of the investors are members of CERES’ Investor Network on Climate Risk and Sustainability. A resolution decision variable (label) is used as the response variable when building a prediction model. The “status” variable was transformed into a numerical integer to suit the machine learning algorithm. The transformed labels for “status” are Vote = 1, Withdrawn = 0, and Other = 2. This was labeled as the voting status indicator. A critical factor in this study is that the textual data source contributes to the forecasting of the voting status of the shareholder resolution. The shareholder can examine the initiatives that may be discussed, as well as open them for a vote at the corporate level. The study’s predictive model focused on the main text body of disclosures. It contains a narrative explanation of environmental issues, relative impacts, and future directions related to coping with such problems. The goal is to predict the voting status of a given disclosure to learn stakeholders’ sentiments toward specific ESG issues.
4. Results and Analysis
This section presents the results of the machine learning-based text analytics performed on the resolutions data set. To surface overlapping key phrases, we validated the results by mapping and comparing the critical topics discovered in the word cloud analysis against the co-occurrence analysis. Subsequently, the cluster analysis was conducted on the main themes and associated topics to identify core concepts and essential factors within each cluster. Lastly, prediction of the vote decision was attempted.
The word cloud map in Figure 5 displays the words most frequently found in the corpus of resolutions. This was generated with wordclouds.com. The larger the size of a word, the more frequently it occurs in the corpus. This affirms what we know via exploratory manual reading and analysis of the key implicit issues in the resolutions. The word cloud shows that the top issues of shareholders as found in the resolutions are lobbying, disclosure, climate, risk, emissions, and sustainability (indicated in the center of the map). Therefore, a key takeaway is that shareholders in companies understand the significance and impact of sustainability and environmental issues like gas emissions. In fact, they want top management to pay attention to these matters.
presents the top 35 words in the shareholder resolutions’ reports. The WordCount function gives a frequency count of how often a word occurs. The most frequent words include report, shareholder, lobbying, climate, and risk. This confirms the high-level sustainability issues that are displayed in the word cloud in Figure 5. However, keywords by themselves do not contribute much to a deeper understanding of shareholders’ interests and motivations regarding sustainability.
Therefore, we next examined the co-occurrence of words. In linguistics, co-occurrence represents the likelihood of occurrence of two terms in a certain order alongside each other within a large corpus of data. In this sense, it is used as an indicator of semantic closeness of terms [114]. This model gives insight as to which issues are related.
displays the top 30 bigrams, showing how frequently two words occur together in the data set. Interesting patterns emerge from this analysis. For instance, “climate” and “change” are found together 2300 times. This indicates that shareholder resolutions are centered around climate change. Similarly, the combination of the top five bigrams shows that stakeholders are attempting to make positive changes on issues like climate change and greenhouse gas through both direct and indirect lobbying.
presents the top 30 trigrams, showing how frequently three words are used together in the data set. Resolutions appear to focus on how and in what direction shareholders are influencing the sustainability landscape in the context of companies. Shareholders are communicating their sustainability activism via direct and indirect lobbying, with a goal of mitigating problems, including greenhouse gas emissions, within reasonable cost.
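The bigram and trigram counts discussed above amount to counting runs of consecutive tokens. The following is a minimal sketch using only the standard library; the helper names `ngrams` and `top_ngrams` and the sample documents are hypothetical, and the tokens are assumed to be already pre-processed.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> Iterable[Tuple[str, ...]]:
    """Yield every run of n consecutive tokens."""
    return zip(*(tokens[i:] for i in range(n)))


def top_ngrams(docs: List[List[str]], n: int, k: int) -> List[Tuple[Tuple[str, ...], int]]:
    """Count n-grams within each document (never across document boundaries)."""
    counts: Counter = Counter()
    for tokens in docs:
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)


# Hypothetical pre-processed resolutions.
docs = [
    ["report", "climate", "change", "risk"],
    ["climate", "change", "lobbying", "disclosure"],
    ["greenhouse", "gas", "emission", "climate", "change"],
]
top_bigrams = top_ngrams(docs, n=2, k=3)
top_trigrams = top_ngrams(docs, n=3, k=3)
print(top_bigrams[0])
```

Counting per document avoids spurious n-grams formed by the last word of one resolution and the first word of the next.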
Next, we generated row similarities using the term frequency-inverse document frequency (TF-IDF) technique. This algorithm searches through a corpus of documents and determines which words are favorable to use in a query. For each word in a document, a value is calculated from the frequency of the word in that document, weighted inversely by the percentage of documents in which the word occurs [115]. A high TF-IDF indicates a strong relationship with the document in which it occurs. This implies that if the word were used in a query, then the document would be relevant and of interest to the user [115]. This technique ensures efficient query retrieval with relevant words. The row similarity analysis indicates the parallels between resolutions. Cosine-similarity, a branch of row similarity, calculates a numeric quantity to denote the similarity between groups of text [115]. The cosine-similarity value reflects the angle between the vector representations of two documents: the smaller the angle, the more similar their orientation. Here, we define a function that takes a resolution title as input. It then outputs a list of the 10 most similar resolutions by implementing the concept of cosine-similarity on the resolutions. Figure 9 shows the approach to identifying the top resolutions that are most relevant. For example, keywords and labels of similar resolutions are generated when a resolution pertaining to renewable energy is input to the model. This offers both shareholders and top management insight into a resolution’s voting outcome based on other resolutions.
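A title-based similarity lookup of the kind described above can be sketched as follows. This is an illustration under stated assumptions rather than the study's function: it uses a plain-Python TF-IDF (raw counts times log(N/df)) in place of a library vectorizer, and the names `tfidf_vectors`, `cosine`, and `most_similar`, along with the sample titles and tokens, are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def tfidf_vectors(docs: List[List[str]]) -> List[Dict[str, float]]:
    """Weight each term by raw term frequency times log(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in docs for term in set(tokens))
    vectors = []
    for tokens in docs:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * math.log(n_docs / doc_freq[t]) for t in tf})
    return vectors


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is all zeros)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(title: str, titles: List[str], vectors: List[Dict[str, float]],
                 top_n: int = 10) -> List[Tuple[str, float]]:
    """Rank every other resolution by cosine similarity to the one with this title."""
    query = vectors[titles.index(title)]
    scored = [(other, cosine(query, vec)) for other, vec in zip(titles, vectors) if other != title]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_n]


# Hypothetical titles and pre-processed text.
titles = ["Renewable energy targets", "Clean energy goals", "Lobbying disclosure", "Board diversity"]
docs = [
    ["renewable", "energy", "target", "emission"],
    ["renewable", "energy", "goal", "carbon"],
    ["lobbying", "disclosure", "expenditure", "payment"],
    ["board", "diversity", "gender", "nominee"],
]
ranked = most_similar("Renewable energy targets", titles, tfidf_vectors(docs))
print(ranked)
```

Because cosine similarity depends only on the angle between vectors, a long resolution and a short one on the same topic can still score as close matches.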
We next explored the potential to identify and organize the key climate change and sustainability issues shareholders are most concerned with by applying the techniques of clustering and classification in an exploratory way. Clustering helps identify the keywords that are central to the resolution; classification can help classify new resolutions into these keywords. The K-means algorithm was applied to generate the clusters. Figure 10 shows the number of keywords in each cluster.
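The clustering step can be sketched with a plain-Python version of Lloyd's algorithm, the standard iteration behind K-means. This is an illustrative stand-in for a library implementation: the two-dimensional points are hypothetical substitutes for TF-IDF vectors, and the initial centroids are passed in explicitly so that the run is deterministic.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Point = Sequence[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points: List[Point], centroids: List[Point],
           max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Lloyd's algorithm: alternate assignment and centroid update until labels stop changing."""
    centers = [list(c) for c in centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(len(centers)), key=lambda j: squared_distance(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centers[j] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centers


# Hypothetical 2-D document vectors forming two obvious groups.
points = [(0.9, 0.1), (0.8, 0.2), (0.1, 0.9), (0.2, 0.8)]
labels, centers = kmeans(points, centroids=[points[0], points[2]])
cluster_sizes = Counter(labels)
print(labels, cluster_sizes)
```

In practice the result depends on the initial centroids and on the chosen number of clusters, so several initializations are usually compared before cluster labels are interpreted.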
Cluster 1, as shown in Figure 11A, has 449 sustainability resolutions with terms like carbon, climate change, emission, energy, methane, renewable, etc. This indicates that the resolutions in this cluster are associated with emission and energy. Therefore, one could tentatively label this cluster as “emission and energy.” Activism has primarily been driven by an approach where investors choose one topic, such as climate change or diversity [58].
The second cluster (cluster 2), shown in Figure 11B, has 292 resolutions. This cluster identifies key concerns of shareholders like diversity (gender, race, etc.), board, committee, compensation, executive pay, voting, etc. Therefore, this is grouped under the general term of “board.”
The issues in cluster 3 (Figure 11C), with 193 resolutions, are primarily communication, legislation, lobbying, oversight, etc. These tend to reflect “regulation”. Cluster 4 (Figure 11D), with 106 resolutions, emphasizes the communication of sustainability ideas via lobbying with the government, campaigning, disclosure, expenditure, transparency, etc. This results in the possible cluster label of “politics”.
Cluster 5, as displayed in Figure 11E, has 42 resolutions. It refers to issues like by-law, ESG, governance, metric, managing, proxy, etc. Overall, this cluster appears to describe “governance”. The governance issues in our sample comprise primarily political lobbying, corruption, and board oversight regarding environmental and social issues [58]. Therefore, this cluster is related to cluster 2, namely, “board”.
Next, cluster 6 (Figure 11F) mentions a variety of products and services at the core of climate change and sustainability. These include deforestation, drugs, food, packaging, plastic, recycling, etc. The products, and their related activities, are essential to human life. However, the process of making them can be improved through sustainability practices like proper recycling and bio-degradable packaging. Therefore, one may assign it the label “Product Management”.
The final cluster, cluster 7 (Figure 11G), has 218 resolutions. It mentions a variety of core concepts of climate change, ESG, and sustainability practices within companies. Its key words include guidelines, environmental, reporting, responsibility, sustainability, etc. Overall, this cluster can be labeled “Accountability”.
Cluster analysis focuses on data position and the visual structure of the data within a cluster. This allows us to pinpoint important nodes (in this case, a keyword) and key findings. Next, we turn our attention to classification. This allows us to engage in predictive analytics. Given the resolution, stakeholders may ask if one can accurately classify the voting status of the resolution.
In this exploratory research, the K-nearest neighbor algorithm is used to build the classification model. Figure 12 presents the number of neighbors (x-axis) and model accuracy (y-axis). Testing accuracy peaks at 61% with five neighbors, indicating the model can predict the voting status of a given resolution with 61% accuracy. With further iteration and refinement of the models, this can be improved further.
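The accuracy-versus-neighbors comparison can be sketched with a small K-nearest neighbor classifier written in plain Python. Everything in the example is hypothetical: the two-dimensional features stand in for document vectors, the labels follow the integer coding used for voting status (1 = Vote, 0 = Withdrawn, 2 = Other), and the toy data are cleanly separable, so the sketch does not reproduce the 61% figure reported above.

```python
from collections import Counter
from typing import Dict, List, Sequence

Point = Sequence[float]


def knn_predict(train_x: List[Point], train_y: List[int], query: Point, k: int) -> int:
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(
        range(len(train_x)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_x[i], query)),
    )
    votes = Counter(train_y[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


def accuracy(train_x: List[Point], train_y: List[int],
             test_x: List[Point], test_y: List[int], k: int) -> float:
    """Share of held-out points whose predicted label matches the true label."""
    hits = sum(knn_predict(train_x, train_y, x, k) == y for x, y in zip(test_x, test_y))
    return hits / len(test_x)


# Hypothetical training and testing split.
train_x = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
           (1.0, 1.0), (0.9, 1.1), (1.1, 0.9),
           (2.0, 0.0), (2.1, 0.1), (1.9, 0.2)]
train_y = [0, 0, 0, 1, 1, 1, 2, 2, 2]
test_x = [(0.1, 0.1), (1.0, 0.9), (2.0, 0.1)]
test_y = [0, 1, 2]

# One accuracy value per candidate number of neighbors.
accuracy_by_k: Dict[int, float] = {
    k: accuracy(train_x, train_y, test_x, test_y, k) for k in (1, 3, 5)
}
print(accuracy_by_k)
```

Plotting `accuracy_by_k` against k gives the kind of curve the figure describes; on real text features the curve typically rises and then falls as k grows.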
Increasingly, activist shareholders are attempting to pass resolutions on sustainability to gain the attention of management. By proposing and putting them to vote, shareholders are bringing attention to climate change, ESG, and sustainability causes. Therefore, shareholder resolutions are a key source of information on the shareholders’ sentiments regarding sustainability. By extrapolation, management and other stakeholders can gain insight into the shareholders’ sentiments.
This exploratory study used several machine learning-based methods in text analytics to analyze and elicit key information from 1737 shareholder resolutions over an 11-year period. The research capitalizes on advances in information processing technology in extracting insight from large corpuses of text, a process that was only entrusted to subjective evaluation in the past. In the process, and in line with our research question, we identified seven key clusters (topics) related to sustainability, representing shareholders’ main concerns. Our analysis shows that shareholders are concerned about topics related to emission and energy, boards, regulations, politics, governance, product management, and accountability. The identification of these key topics is consistent with results from other studies related to ESG and sustainability reports, typically from a managerial perspective. It therefore appears that both shareholders and management are interested in similar themes.
Analyzing shareholder resolutions allows a window into the landscape of issues that shareholders are passionate about. Analyzing shareholder resolutions also helps companies foresee the evolution of these issues in the public eye. In addition, it helps companies tell whether they are likely to become targets of public scrutiny related to the issues [79].
This exploratory study attempts to gain insight into the climate change and sustainability concerns of shareholders by examining shareholder resolutions. However, there are many other issues warranting investigation. For example, to what extent can text documents such as the resolutions be used to forecast sustainability trends and managerial action? Trend analysis can be performed to examine attitudes and impact over time. The impact of resolutions on the public in general and vice versa can be examined from a social media perspective (public sentiment analysis). Reproducibility and validation are key challenges in the application of machine learning and text analytics. Additionally, the intuitive labeling of clusters makes the interpretation of the machine learning results subjective. Yet, we are confident that the results of the analysis, for example, the labeling of the clusters, generally represent what is described in the resolutions, as confirmed by prior literature. Thus, the approach provides a powerful methodology to study and understand sustainability. Additional limitations include the reliability of the documents and validity of data preparation techniques. Generalizability of the topics (clusters) may be limited because the sample of resolutions is relatively small. They may not characterize overall corporate efforts and initiatives regarding sustainability. Further, machine learning models are only capable of extracting a limited amount of insight. Additionally, this study used the summary section as the sole source of unstructured data. Future research should investigate the value of certain sections of a company’s 10-K reports and user-generated content (e.g., tweets, blog postings, etc.).
Despite the limitations, our study contributes to policy and research in three ways. First, practitioners and researchers can utilize the results to prioritize sustainability initiatives and examine some of the clusters via different lenses like those of non-governmental organizations (NGOs) or other activists. Second, the study demonstrates the efficacy of machine learning and text analytics in understanding and gaining insight into shareholder resolutions to make informed decisions by conducting descriptive and predictive analytics. Third, we demonstrate the evolutionary trend of how shareholder resolutions can be an effective venue for fostering changes in corporate practices by corporate management.
Shareholders, being a large and prominent group of stakeholders, can effectively project a positive image of the company with transparency. In fact, regardless of whether a resolution is passed or implemented, the resolution being publicly available creates a positive impact on the dialogue regarding corporate sustainability. Future research can continue to explore and apply advanced techniques like deep learning to delve into shareholder resolutions and disclosure reports for additional content analysis. For example, prescriptive analytics can be investigated to not only predict outcomes, but also suggest likely impacts and potential strategies.
Other avenues for future research include conducting cross-industry comparisons, studying global differences, and examining resolutions that affect corporate sustainability and its correlation to company performance. Sophisticated application of advanced techniques, such as artificial intelligence and deep learning, will accelerate the maturing process of gaining insight from shareholder resolutions on climate change and sustainability. Even though most shareholder proposals in the domain of ESG receive less than 20% of votes in support [58], these proposals nevertheless serve as significant catalysts of action within companies [116] and persuade management to take effective action. Additionally, applying discovery analytics on the resolutions may shed light on innovation and new product ideas.