Analyzing Sustainability Reports of Global, Public Corporations by Industrial Sectors and National Origins

Due to the demand by stakeholder groups of global public corporations for greater transparency in business operations, corporations have continuously tried to embody the concept of sustainability in their business strategies and operations. That is, they have collectively published sustainability reports to state their progress toward achieving sustainability goals. However, understanding of the thematic and conceptual structures of environmental and sustainability reports of the global public corporations is still limited. In this study, the author identified key thematic attributes through text data mining analysis: (a) sustainability, (b) energy, (c) approach, (d) environmental, and (e) people for industrial sectors. Additionally, themes of (a) business, (b) employees, (c) financial, (d) energy, and (e) suppliers appeared most frequently and were the top five compounding themes for overall national origins. In fact, the majority of findings pointed out that these themes and concepts have limited environmental and climatic relevance, as they only align with certain goals, such as UN SDGs 12, 13, and 14. That is, understanding key factors of their sustainability reports is crucial toward overcoming the challenges of achieving SDGs. Furthermore, these findings in accordance with industrial sectors and national origins of the global public corporations can help to derive a more in-depth understanding of current reports on environmental and sustainability-driven business operations.


Introduction
The number of global public corporations which publish information on sustainability issues has been steadily increasing, especially in the last two decades [1][2][3]. The fundamental purpose of producing and publishing this type of information is to inform their stakeholder groups as to their social, environmental, and governance activities. This happens because global public corporations are also engaged in the adoption of sustainability practices due to potential regulatory pressures from the governing bodies of international standards [1][2][3]. Without consideration of sustainability issues for the next generation, excessive competition among these corporations solely for economic profitability would lead us to resource depletion and failures in effectively controlling waste, air pollution, and water resources [4][5][6][7]. For instance, in 2015, the United Nations (UN) presented the SDGs, which consist of 17 main goals that address 169 detailed goals and 241 indicators as a potential mechanism for monitoring sustainability progress [7]. The 17 SDGs have been regarded as an action plan for organizations, including public corporations, to deal with the ongoing common goals of sustainable development [8,9]. These corporations have carefully considered the potential demands of all stakeholders, including shareholders, consumers, internal employees, communities, and associated public agencies for greater transparency on social and sustainable issues [2,8,10]. Investment in environmental, social, and corporate governance (ESG) activities for sustainable growth is essential to maintain a socially responsible image, which impacts financial performance [11]. Global public corporations have engaged in environmental issues due to increasing interest in social responsibility since the late 1990s, resulting in the publication of advanced sustainability reports in recent years [8,10,11]. 
These socially responsible activities are implemented primarily by large/multinational public corporations with global market shares [12][13][14]. Since global public corporations can only achieve their mission and goals through multi-stakeholder collaboration and transactional partnerships with public entities, their behaviors in the global market can play a critical role in achieving the Sustainable Development Goals (SDGs). In this regard, the publication of sustainability reports has become one of the critical communication options for making global public corporations' sustainability activities visible for competitive market advantage [15]. However, global public corporations have only applied select components of sustainable practice on a macro level to increase sustainable development while considering the associated ecosystem and climate impacts [10,16].
The degree of effort and the key issues identified in reporting the progress of sustainable business operations differ from sector to sector and also by national origin [1,16,17,18].
As presented in Figure 1, the SDGs can only be achieved when economic development, social inclusion, and environmental sustainability are well balanced by both private and public organizations [19].
Reports that reflect the social/environmental performance and commitments of global business operations towards sustainable development have been published voluntarily under titles such as "sustainability reports", "corporate social responsibility (CSR) reports", "social and community reports", and "environmental reports" [9]. Since there are no regulations on the style and format of such reports, the quality assurance of reporting became a controversial issue among these corporations [8][9][10][11]. The Global Reporting Initiative (GRI) is a network-based non-profit organization involved in the communication of social, environmental, and sustainability reports [19][20][21][22]. The majority of corporations that have published a sustainability report have since followed the GRI reporting standards, which are available in 12 different languages [7,10,20,21,22].
Other standards for assessing the performance of these reports are also available, such as the AA1000 Assurance Standard by AccountAbility and ISAE 3000, the International Standard on Assurance Engagements [10,21,23]. The GRI has been considered one of the most reliable providers of sustainability disclosure standards and reports [24]. Major metrics, such as the FTSE4Good and Dow Jones Sustainability indexes [12,13], have been designed to measure the sustainability-related business practices of public corporations (e.g., corporations whose public trading is indexed in the S&P Dow Jones Indices) in terms of their progress and achievements regarding the defined ESG criteria [13,14,20,25]. In fact, most of the previous research on corporate sustainability reports has substantiated the role of sustainability reports as communication and public-relations tools that influence the long-term benefits of organizational resilience, governance, and equity perceived by various stakeholders [8,10,11,22]. However, empirical analysis of the contents of sustainability reports remains limited.

Research Problem
The fundamental goal of this study was to analyze potential themes and relevant concepts included in the sustainability reports of global public corporations through a novel text mining technique. For this research, there were two research questions: (1) What are the overall thematic and conceptual structures of the sustainability reports based on industrial sectors? (2) What are the overall thematic and conceptual structures of the sustainability reports based on national origins of the global public corporations? In other words, the author attempted to explore the semantic relationships among these themes and concepts of 200 global corporations' reports selected from the "Forbes Global 2000 List" [23]. The author also tried to analyze the overall thematic and conceptual structures of the sustainability reports and to verify the differences in the reports according to industrial sector and national origin.

Text Data Source Selection
The data sources for this analysis were selected from the "Forbes Global 2000 List" of 2020. This list has been published annually by Forbes since 2003 and ranks the largest global public corporations based on four financial metrics: (1) sales, (2) profits, (3) assets, and (4) market value [23]. This list is one of the most reliable indicators of the financial performance and business operations of leading public corporations [23]. Based on the list, which is composed of 10 industrial sectors, a total of 1863 official corporation websites were visited, and finally, 10% of the listed corporations were selected by applying a stratified sampling method (i.e., selecting 10 corporations for each industrial sector). Furthermore, the 200 selected public corporations were categorized into eight different national origins. Any nation with fewer than three corporations was excluded from the final analysis because the data were insufficient for the data mining process. To ensure comprehensive coverage of relevant reports on sustainability or sustainable issues of business operations, the document archive of each corporation was searched using the keyword "sustainability" or "sustainable". Relevant sustainability reports were retrieved and transformed into analyzable formats for coding and analytic procedures (Figure 2). Additionally, the sustainability report archives of the GRI were also visited. It was found that the GRI report archives can effectively request and store the most up-to-date reports for their users and report providers.
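The sampling and exclusion steps described above can be illustrated with a short Python sketch. It is a minimal illustration only, not the procedure actually used in the study: the function names, the record layout of (name, sector, nation), the fixed random seed, and the toy list of corporations are all assumptions introduced here.

```python
import random
from collections import Counter, defaultdict


def stratified_sample(corporations, per_sector, seed=0):
    """Draw up to `per_sector` corporations at random from each industrial sector.

    corporations: iterable of (name, sector, nation) tuples (hypothetical layout).
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_sector = defaultdict(list)
    for record in corporations:
        by_sector[record[1]].append(record)
    sample = []
    for sector in sorted(by_sector):
        members = by_sector[sector]
        sample.extend(rng.sample(members, min(per_sector, len(members))))
    return sample


def drop_small_nations(sample, min_corporations=3):
    """Exclude nations represented by fewer than `min_corporations` corporations."""
    per_nation = Counter(nation for _, _, nation in sample)
    return [r for r in sample if per_nation[r[2]] >= min_corporations]


# Hypothetical toy list (not the Forbes data): 3 sectors x 6 corporations.
sectors = ["Banking", "Energy", "Retail"]
nations = ["USA", "Japan", "Germany"]
corporations = [
    (f"Corp{i:02d}", sectors[i % 3], nations[i % 3]) for i in range(18)
]
sample = stratified_sample(corporations, per_sector=4)
kept = drop_small_nations(sample)
```

Sampling within each sector separately keeps every sector equally represented, which a simple random draw from the whole list would not guarantee.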

Coding of Text Data
Two coders independently searched and retrieved relevant documentation by screening content in accordance with the following review criteria. During the review process, documents were excluded if (1) the reports were not written in English, (2) the primary focus of the documents was not related to sustainability goals and agendas, or (3) the documents contained little relevant content. If there was any disagreement on report selection, the coders discussed internally whether the report satisfied these basic criteria. The final inclusion criteria were as follows: (1) the report must not contain only financial information (e.g., annual performance reports), and (2) the report must concern environmental, social, and governance business practices, or must concern strategic plans and annual outcomes of a commitment to the 17 SDGs. After the retrieval of the text data, an Excel spreadsheet for the selected corporations was generated that included the details of the reports and the financial performance and characteristics of each corporation. Most importantly, the selected corporations were categorized based on (1) industrial sector and (2) national origin.
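The screening rules above can be expressed as a simple filter, sketched here in Python under stated assumptions. The `Report` fields, the function names, and in particular the `min_relevant_share` threshold are hypothetical: the study does not quantify what counts as "little relevant content", so the value below is a placeholder.

```python
from dataclasses import dataclass


@dataclass
class Report:
    corporation: str
    language: str
    financial_only: bool       # e.g., an annual performance report
    covers_esg_or_sdgs: bool   # ESG practices, or SDG plans and outcomes
    relevant_share: float      # coder-estimated share of relevant content, 0 to 1


def passes_screening(report, min_relevant_share=0.2):
    """Apply the exclusion and inclusion criteria to one report.

    `min_relevant_share` is an assumed cut-off, not a value from the study.
    """
    if report.language != "English":
        return False
    if report.financial_only:
        return False
    if not report.covers_esg_or_sdgs:
        return False
    return report.relevant_share >= min_relevant_share


def needs_discussion(decisions_a, decisions_b):
    """Return the corporations on which the two coders' decisions differ."""
    return sorted(c for c in decisions_a if decisions_a[c] != decisions_b.get(c))


# Hypothetical reports, for illustration only.
reports = [
    Report("Corp01", "English", False, True, 0.8),
    Report("Corp02", "Chinese", False, True, 0.9),
    Report("Corp03", "English", True, False, 0.0),
    Report("Corp04", "English", False, True, 0.05),
]
selected = [r.corporation for r in reports if passes_screening(r)]
disputed = needs_discussion({"Corp01": True, "Corp04": False},
                            {"Corp01": True, "Corp04": True})
```

Reports flagged by `needs_discussion` correspond to the cases the coders resolved through internal discussion.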

Data Analysis
The collected text data were analyzed using the text data-mining technique via Leximancer version 5.0 (Leximancer Pty Ltd., Brisbane, Australia). Leximancer is a widely accepted analytic tool for text and is able to automatically build a thesaurus of words and phrases, which are then transformed into concepts based on contextual similarities and into themes based on groups of concepts [26][27][28]. The tool employs semantic and relational extraction on different strata of text data [28]. In other words, the concept and theme-mapping algorithm, which is derived from Bayesian decision theory, is based on hierarchy of appearance and relational extraction [27]. The algorithm automatically counts words and phrases that co-occur and detects significant inter-relationships and semantic linkages among concepts through non-linear dynamics and machine learning methods such as clustering [29,30]. In clustering concepts, these methods draw on asymmetric concept co-occurrence information. Then, a visual map is generated depicting the thematic and conceptual relationships among words and paragraphs of any given text data. Note that the relative co-occurrences of words and phrases were depicted as equivalent thematic and conceptual networks in Leximancer's concept map [30]. Since a text source and its segments are coded and analyzed using a computer-aided analytic process, stability (i.e., coder reliability) is not a concern for this program. Words that share contextual similarity are considered "concepts" under the domain of the Leximancer program. Moreover, clusters of "concepts" are transformed or grouped into "themes" with relative titles. One of the most distinctive features of the analytic tool is its ability to display a concept map.
The following are the important features of the concept map: (1) a color represents the level of importance of each theme (i.e., red indicates the most connectivity, and purple is the least connected theme), and (2) the size of a theme is an indicator of the simultaneous appearance of a concept with other concepts when using text mining analysis.
In a visual concept map of the Leximancer program, the clustered concepts are called "themes". Each theme is named for the most connected concept within a colored circle, and the size of a theme circle reflects its connectivity with other concept dots [26,27]. The Leximancer program produces a dashboard report containing statistical results of the text analysis. The key indicators of the report include (1) "hit count", which is a simple frequency indicator of a certain word-like concept presented in a dataset; (2) "relevance", which refers to the co-occurrence of word-like concepts and is estimated as a percentile rate; and (3) "connectivity", which is the sum of all the text co-occurrence counts of any given concept with every other concept on the concept map (Leximancer user guide, release 5.0). This report is a quantitative overview of Leximancer's concept map and presents thematic and conceptual similarities and differences among sourced text data. Theme colors range from hot (red and orange) to cool (blue and green).
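To make the three indicators concrete, the following Python sketch computes them from a handful of concept-coded text segments. It is a simplified illustration and not Leximancer's actual algorithm, which is proprietary and also involves thesaurus learning and clustering. Two points are assumptions made here: relevance is computed as a concept's hit count expressed as a percentage of the most frequent concept's hit count, and the function name and sample segments are hypothetical.

```python
from collections import Counter
from itertools import combinations


def concept_indicators(coded_segments):
    """Return (indicators, pair_counts) for concept-coded text segments.

    coded_segments: iterable of sets, each holding the concepts found in one segment.
    """
    hit_count = Counter()    # number of segments in which each concept appears
    pair_counts = Counter()  # number of segments in which two concepts co-occur
    for concepts in coded_segments:
        unique = sorted(set(concepts))
        hit_count.update(unique)
        pair_counts.update(combinations(unique, 2))

    # Connectivity: sum of a concept's co-occurrence counts with every other concept.
    connectivity = Counter()
    for (a, b), n in pair_counts.items():
        connectivity[a] += n
        connectivity[b] += n

    # Assumption: relevance is scaled so that the most frequent concept equals 100%.
    top = max(hit_count.values(), default=0)
    indicators = {
        concept: {
            "hit_count": n,
            "relevance": round(100 * n / top),
            "connectivity": connectivity[concept],
        }
        for concept, n in hit_count.items()
    }
    return indicators, pair_counts


# Hypothetical coded segments, for illustration only (not data from the study).
segments = [
    {"sustainability", "energy"},
    {"sustainability", "people"},
    {"sustainability", "energy", "environmental"},
    {"people"},
]
indicators, pair_counts = concept_indicators(segments)
ranked = sorted(indicators, key=lambda c: indicators[c]["connectivity"], reverse=True)
```

In this toy example "sustainability" ranks first on all three indicators, which mirrors how the most connected concept gives its name to a theme on the concept map.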

Results
The study was designed to conduct thematic and conceptual analysis of the reports of select corporations. Common themes and concepts across all reports were revealed through a semantic extraction process and are presented in Figures 3 and 4. The concept map developed using the content analysis software Leximancer is a graphical representation of the themes and concepts of a data source, where size and color indicate the relevance of a theme or concept, and the degrees of mathematical connectedness among words are identified [26]. Moreover, this concept map also shows the relative sizes and colors concerning the relevance of the themes [26,27] (i.e., relevance is presented in terms of temperature; red marks the hottest theme bubble, which has the highest frequency). That is, the connectivity score refers to the degree of similarity among different themes. The Leximancer analysis identified a total of six significant themes across industrial sectors, among which "sustainability" had the strongest significance in terms of hit count and connectivity, followed by "energy", "approach", "environmental", "people", and "equipment". "Sustainability" was also observed to be the strongest concept (i.e., it had the highest relevancy rate), followed by "management", "social", "environmental", and "governance". The key thematic and conceptual structures of each industrial sector are highlighted in Table 1.
Note to Table 1: HC: hit count; RR: relevancy rate. The relevancy rate is the percentage frequency of text segments coded with that concept from the analysis of each industrial sector; all themes analyzed by the Leximancer program are presented in the row "Theme"; fewer than 10 word-like concepts are listed in the row "Concepts".
For the overall national origin, six corporations per nation were considered for the analysis. Across national origin, "business", "employees", and "financial" were identified as the top three compounding themes, followed by "energy", "suppliers", "assets", "value", "members", and "employees".
Moreover, the concept of "information" had the highest relevance rate, followed by "environment", "report", "data", "employee", "CO2", and "environment". Notable differences in themes and concepts were found based on national origin. The analysis results on prominent concepts for each national origin are reported in Table 2.

Discussion
Owing to innovations in data analytics, the author was able to systematically analyze, exploit, and synthesize large quantities of text data. Such data in text form are a rich and fundamental source of information that allows the comprehension of social behavior in humans and organizations [27][28][29]. Thus, the findings of this study present fresh insights into the nature of corporate sustainability reports and how these reports differ across industrial sector and national origin. For the variable of industrial sector, no significant agreement among the sectors could be found because of the unique features of their business operations and internal/external stakeholder groups. Nevertheless, the relevant concepts of environmental sustainability appeared throughout the reports of the selected corporations.

For instance, the reports of producers of consumer products (e.g., material, semiconductor, aerospace, and construction industries) focused on issues such as "energy", "CO2", "emissions", and "waste". These are only relevant to the environmental and climate-oriented areas of SDGs 12, 13, 14, and 15. It is true that these SDGs are basic safeguards against unsustainable commercial activities and ecological collapse [13,14,19,20,21,22]. In 2020, more than 260 global public corporations participating in RE100, a global initiative led by the Climate Group and CDP, had committed to replacing conventional electric power sources with 100% renewable power sources for their global business operations by 2050 [29][30][31][32][33]. The corporations are obligated to report annual progress toward renewable energy sourcing and toward achieving a zero-carbon grid [32]. Global public corporations from a wide range of industrial sectors and across national origins have been joining this initiative. For instance, in April 2018, Apple announced that its global facilities across 43 countries run on 100% renewable electricity. Samsung Electronics, the technology hardware corporation selected for this study, has been operating a cell phone application called "Samsung Global Goals" jointly with the United Nations to raise awareness of the 17 SDGs. The application, installed on every smartphone that is sold, generates consumer interest in the 17 SDGs. It also raises donations for the United Nations Development Program (UNDP) through advertisements.
The results from the global public corporations show that sustainability remains an interdisciplinary concept that has to be cohesively defined, communicated, and implemented to achieve the 17 SDGs [34-37]. Moreover, the themes and concepts identified in this study differ from sector to sector due to the unique circumstances of their business operations and multi-stakeholder partnerships. For instance, the findings concerning industries that are more service-oriented than technology-oriented (e.g., banking, diversified financial industries, insurance, media, and retail) indicated stronger sensitivity to satisfying potential customers' needs. In fact, concepts such as "health", "customers", and "social" are prominent in the reports from these sectors.
The analysis results based on national origin provided a comprehensive review of the role played by this variable in the structure of corporate sustainability reports. For instance, the reports of German corporations presented the highest diversity of themes (12 different themes) and focused on person-centered themes (i.e., customers, services, employees, and governance). The reports of the corporations from the UK, the USA, and Korea identified 11 new themes with relevant concepts. The fewest themes emerged from the reports of Japanese corporations; however, these reports focused on the issues related to employee engagement (i.e., human resources, employees, and support). Japan is known for having the highest percentage of corporations issuing sustainability reports, followed by the US, France, and Germany. On the other hand, among the five national origins discussed in the study [10], China had the lowest percentage of report publication.
Although there has been increasing interest in the domain of sustainability research, especially during the last decade [7,10,14,16,38], sustainability is still a complicated construct. In practice, the industrial sectors, business owners, scholars, and public/governmental agencies have varying views on the definition of sustainability [21,22,31,33]. In particular, global corporations have not agreed upon a common level of recognition of the 17 SDGs [8,10,16]. This lack of connectivity between the reports of corporations and the 17 SDGs was confirmed by the findings of this study. Despite the 17 SDGs serving as a fundamental framework for corporate reports [21,22,31,33], certain environmental issues such as climate change, clean water and energy, CO2 emissions, and pollution are strictly regulated. As indicated in the Sustainable Development Goals Progress Chart 2020, the SDGs related to climate change and ecosystems (i.e., SDGs 13-15) require global and cohesive support.
However, for other issues of the 17 SDGs (i.e., no poverty, equal opportunity for both genders, education, well-being, human rights, resilient industry and infrastructure, and sustainable cities/communities) under the 2030 agenda, there are notable differences in the pace of progress depending on geographical location and economic status. For instance, for SDG1 and SDG2 (i.e., ending poverty and achieving food security), there are differences in the level of progress between developed countries and under-developed regions, such as Sub-Saharan Africa [33,34].
Regardless of the industrial sector and geographical region under which a global public corporation falls, the commercial activities of public corporations will always play a major role in creating and supporting relevant values, policies, and regulations among all stakeholder groups. Therefore, corporations must refine, address, and communicate those insufficiently reflected areas of the SDGs to better contribute to resolving global crises. Ideally, a sustainability report should be prepared based on the empirical demands of the various stakeholder groups of corporations and the emergent issues of future generations [35,36,37,39]. To achieve the foremost mission of the SDGs, global public corporations should continue their efforts regarding sustainable practices beyond existing regulations and standards. In other words, global public corporations should also pay greater attention to creating organizational value and a culture for the advancement of sustainability practices. More practically, global public corporations should invest financial resources to educate their employees and to communicate sustainability-related organizational value to their external stakeholders [1,4]. The publication of sustainability reports should not be merely a formality to meet the demands of the stakeholder groups and to deal with external regulatory pressures [1,3,34,35,36,37].

Conclusions
To work towards the goals described above, longitudinal and empirical data on the actions of global public corporations should be collected and analyzed to redefine theoretical and practical frameworks for future reports [39][40][41][42][43]. This has the potential to result in better connectivity between the contents of sustainability reports and the 17 SDGs.
The text-mining analysis performed in this study illustrated the thematic and conceptual structures of the sustainability reports of global public corporations. The text-mining tool Leximancer might be one of the few tools that automatically performs delicate and time-consuming procedures of content analysis while systematically minimizing human bias and errors [24,25,43,44,45]. Despite this methodological advantage, the theoretical and practical frameworks of sustainability reports are still not clear, and more research is needed to optimize strategies for sustainable business operations [38].
This study has some limitations. First, only sustainability reports written in English were compiled for the analysis, owing to a limitation of Leximancer. Many reports of China-based corporations were written in Chinese; these reports were not included in the analysis. Secondly, considering the national origins of the corporations, the generalizability of these findings is limited because of the unequal numbers of corporations located in different countries and the varying distributions of industrial sectors in different countries. Thirdly, the findings of this study do not account for the opinions or judgments of external stakeholder groups. There has been a rapid increase in the demand for public engagement in the environmental, social, safety, and managerial issues involving corporations. As a consequence, under competitive business scenarios, global public corporations must be effective and efficient communicators to satisfy the demands of sustainable management. Future sustainability reports should be based on empirical data on the activities of enterprises, instead of solely reporting qualitative data. The reports should be able to demonstrate multi-pronged efforts at the macroscopic level for cooperation among various stakeholder groups. For future research, it might be interesting to verify the potential gap between the internal and external opinions regarding corporate sustainability by adopting mixed-method research designs (i.e., a comparison between empirical text data from corporations' reports and consumers' perceived opinions of the corporations' sustainability performance).
Funding: This research received no external funding.
Institutional Review Board Statement: Ethical review and approval were waived for this study since no personal data was involved in this study.
Informed Consent Statement: Not applicable since no human subjects were involved in this study.

Data Availability Statement: The datasets generated during and/or analyzed for the current study are available from the author on reasonable request.