The Concept of Sustainability on Social Media: A Social Listening Approach

Abstract: The concept of sustainability has gone far beyond the issues of the sustainable management of natural and environmental resources. Nowadays, sustainability is part of the social sciences in a different way. The aim of this research was dual. Firstly, we analyzed the different contexts and areas of knowledge where this concept is used in society by using social listening on Twitter, one of the most popular social networks today. The sentiments of these conversations were rated to assess whether the feelings and perceptions of these conversations on the social network were positive or negative regarding the use of the concept. Also, we tested if these perceptions about the topic were attuned to other more formal fields, such as scientific research, or strategies followed nationally or internationally by agencies and organizations related to sustainability. The method used on this first part of the research consisted of an analysis of 15,000 tweets collected from Twitter using natural language processing (NLP) for clustering the main areas of knowledge of topics where the concept of sustainability was used, and the sentiment of these conversations on the social network. Secondly, we mapped the social network of users who generated or spread content regarding sustainability on Twitter within the period of observation. Social network analysis (SNA) focuses on the taxonomy of the network and its dynamics and identifies the most relevant players in terms of generation of conversation and also their referrers who spread their messages worldwide. For this purpose, we used Gephi, an open source software used for network analysis and visualization, that allows for the exploration and visualization of large networks of any kind, in depth.
The findings of this research are new, not only because of the mix of technology and methods used for extracting data from Twitter and analyzing them from different perspectives, but also because they show that social listening is a powerful method for analyzing relevant social phenomena. Listening on social networks can be used more effectively than other more traditional processes to gather data that are more costly and time consuming and lack the momentum and spontaneity of digital conversations.


Introduction
Sustainability is the quality of the sustainable. This is expressed in the Cambridge Dictionary: "the quality of being able to continue over a period of time". Sustainability is therefore the quality of temporal continuity, without altering its specific properties, and can be associated with multiple natural, social, political, or economic phenomena. The most widespread and well-known concept associated with sustainability is "sustainable development", introduced by the UN in 1987 [1]. However, the idea of sustainability is much older. For instance, the concept was used in relation to Young's [2] discussion of the enclosure movement in the early 19th century [3], in which sustainability and productivity were associated with the development of private institutions and initiatives. Similarly, sustainability also appears as a novel criterion in the use of forest resources, through principles of sustainable management [4,5]. Sustainability also appeared when the term "maximum sustainable production" was coined in Schaefer's study of fisheries [6].
In addition to its use in the management of natural and environmental resources, the concept of sustainability has been used frequently in the social sciences. For instance, in the social science literature we found sustainable competitive advantage, corresponding to "an advantage that allows a business to be more successful than its competitors over a long period of time" (https://business_finance.enacademic.com/). Similarly, public debt is considered to be sustainable, and economists introduce sustainability rules to make it so. Very specific problems of economic policy, such as dollarization, are seen as a problem of "fiscal sustainability" [7]. With reference to the foreign sector of the economy, external sustainability is also required [8]. In the same way, pension sustainability seeks to allow pensions to be maintained in the long run, just as the welfare state has to be sustainable in the long run, which in turn will require its own more general and complex rules [9].
However, since 1987, sustainability, in relation to businesses, has taken on the sense of being careful with the use of natural resources, beyond the relative scarcity expressed through prices [10]. There is also now a certain circularity in the use of these resources that refers to tasks not only of recycling but of transformation [11].
It is interesting to note, for example, how the processing industry has adapted to the concept, especially the part of it based on combustion processes, such as metal casting. Thus, an editorial in the American sector journal referred to the speech of the president and CEO of ThyssenKrupp Waupaca at the 114th Metal Casting Congress in the United States, in which sustainability was already concerned not only with the survival of the planet, but also with the survival of the country and the sector, for which it proposed to strengthen the use and development of "green technologies". Similarly, in economics, we found the concept of sustainable finance that "emphasizes the environmental development and institutional aspects of the investment that promotes sustainable growth" [12].
We even found the development of indices such as those of the Dow Jones Sustainability Indices (DJSI), developed from 1999 in a collaboration between S&P Dow Jones and RobecoSAM (https://www.robecosam.com/csa/indices/djsi-index-family.html), which pointed out the extent to which a company will generate long-term value to the investor depending on the management of those resources that are more or less sustainable in terms not only economic, but also environmental and social. In a similar way, there is a growing concern in the economic world to turn the company's organizational culture into "sustainable sustainability" [13], in which the concrete proposal for sustainability is articulated as a central part of the strategy to differentiate itself from other organizations, forcing the reporting not only in financial terms but also in social and environmental terms.
There are many more examples of the increasing use of sustainability and the sustainable as elements of scientific research and social and economic behavior. However, there is little research, to our knowledge, on how people currently talk, perceive, and interact with the concept of sustainability on social networks and whether their behavior is oriented towards the type of action that we call sustainable.
In an attempt to understand these issues, the work that follows investigated how sustainability and sustainable terms are handled on social networks and what this entails. This research also tried to find out to what extent this perception on social media is aligned with what scientists do when they research sustainability and sustainable.
The second section is a literature review and states our hypotheses. We contrast the use of sustainability and sustainable terms in social networks with the world of scientific research. The third section provides a description of the Twitter data collection and the methods applied to analyze the tweets. In a fourth section, the empirical analysis of the results is carried out, followed by the conclusions.

Literature Review and Research Hypotheses
Sustainability is a complex, dynamic, and interdisciplinary problem. This forces the development of what is called a "transdisciplinary approach" [14]. It is important to understand how social and natural sciences must "embrace" [15] to advance the ultimate goals of this complex field of sustainability, but also to understand how society perceives these objectives and what behaviors it considers necessary to modify for this purpose.
A very positive step in this direction is seen in the work related to the scientometric review of the research that is being carried out throughout the world in relation to sustainability and sustainable development [16]. In this case, the research methodology has sought to map scientific knowledge and understand the structure and development that scientific researchers, faculties, countries, and scientific journals have followed with the choice of topics, terms, and trends in sustainability studies.
With 2094 bibliographic records, the approach followed by [16] is based on analysis of coauthorship, cocitation, or word use analysis, as well as performing clustering analyses based on geospatial analysis, generating networks of authors and scientific journals, from the Keyhole Markup Language (KML) files and the use of Google Earth.
This type of analysis, based on a broad semantic conception, or a network format, had already been used in other coauthoring analyses or publications in other areas, such as applied mathematics and physics [17], just as the art of "navigating" hypertexts had previously been introduced [18].
What is new in this scientometric analysis is its focus on sustainability and sustainable development, demonstrating that in international research on these topics, there are institutions that are intensifying this research area, such as the Chinese Academy of Sciences, with more than 65 articles in a brief interval of time. Also, there appeared to be clear citation bursts by country and period, of variable length, between 1991, the initial moment, in the United Kingdom, and 2014 [16]. Likewise, using these citation bursts and some measures of centrality, fourteen keywords and themes were found to be the most referenced and influential in the realization of these articles. These themes included environment, climate change, design, policy, impact, sustainability indicator, energy, and innovation. Also, the central and most impactful journals, from the Journal of Cleaner Production (impact factor of 5.715) to more general ones such as Current Opinion in Environmental Sustainability (impact factor of 3.954), allowed us also to analyze how the map of the most cited authors was formed.
No less important is the centrality study of the most influential documents, among which [10] stands out. In addition, the clustering exercise carried out in [16] allowed us to detect eleven fields of analysis with clear labels and reference documents in each of them. In short, the article provides a scientific visualization of a literature devoted to the study of sustainability and sustainable development, using scientometric analysis based on coword, cocitation, or geospatial clustering and analysis.
As shown by the results of [16], in the literature of the social sciences, sustainability offers us an answer to diverse challenges in mankind's use of natural resources. The extent to which sustainability is seen in seminal works as the need to find an appropriate institutional framework [19] and, in this sense, to find novel solutions (in the semantic field of "design", "policy", or "innovation") is clear. We see, too, that there is a feeling of the need to argue production and consumption models and to reflect on the limitations of substitutability between natural resources and resources generated in economic activity, in classical contributions [20,21], which are at the heart of the problems posed by "climate change" from "energy" uses. Similarly, sustainability forces a time-critical approach [22] and material well-being is only ensured on a long-term path, where important intergenerational issues are raised [23,24], focusing a lot of current research around "design", "sustainability indicators", or "impact" or "energy" use, again, or "policy", again and again.
Within this perspective, our approach reinforces and complements other lines of research using semantic techniques and network theory around research production containing sustainability and sustainable development terms, like [16]. Our contribution is dual. Firstly, we worked on a sentiment analysis of Twitter data using natural language processing (NLP) techniques, and secondly, we drew the social network that generates, interacts, and spreads messages on Twitter regarding the topic of sustainability. We focused on two main hypotheses that were relevant for the analysis of the concept of sustainability on social media.
Our first hypothesis (H1) asked if there is a semantic map that allows us to systematize social expressions related to sustainability, extending the applications of NLP and clustering based on sentiment analysis methods from previous research such as [25][26][27]. To this end, two complementary hypotheses were raised. Firstly, (H1a) stated that social media is attuned to other more formal fields, such as scientific research or strategies followed nationally or internationally by agencies and organizations of a different nature related to any aspect of sustainability. Secondly, (H1b) stated that the terms sustainable and sustainability are also mainly used in a positive context on the Twitter social network. Users generate dialogue, look for information, and spread it among the network as a way of amplifying a particular message.
The instrument for testing this first hypothesis was an analysis of the tweets using natural language processing (NLP) for sentiment analysis. An analysis of the relative importance of the terms was carried out, with the terms already classified according to the feeling they generate, and the results were analyzed based on some of the most widespread theories and evidence in the social sciences that have as their object the study of sustainability or the sustainable.
As a second hypothesis (H2), we used graph theory to draw the network of accounts that generate content about sustainability and to identify which accounts interact with them. For this purpose, some of the most relevant nodes and connections were studied, providing different measures of centrality and cohesion of the network built for this purpose [28,29].

Data Collection and Methods
The analysis performed in this research had two main objectives. The first one consisted of analyzing the tweets that contain the words "sustainable" or "sustainability" on the Twitter social network and understanding the main topics and feelings linked to them. For this purpose, a sample of 15,000 tweets was collected within thirty days, from the 23rd of November to the 22nd of December 2019, which contained either of these two keywords: "sustainable" or "sustainability". After a detailed inspection of the quality of the data to verify that the tweets met the requirements to be used in this research, 14,029 (93.52%) were considered valid.
The second aim consisted of the analysis of the network of relationships among the accounts that interact with the gathered tweets, regarding the topic of sustainability. We identified the accounts that create the tweets and the accounts that retweet them and thereby act as referrers. In addition to this, we drew the structure and hierarchy of the social communities that share content and drive the conversation.
From a technical point of view, the data were extracted with our own Python script, which connects to the premium Search Tweets 30-day API. This API provides rolling access to the last 30 days of historical data and also allows us to apply filters for the extraction, such as date, hashtag, user account, etc. In the extraction for this research we only applied a filter using the keywords "sustainable" and "sustainability".
The gathered tweets were stored in a JSON file that was processed and analyzed using our own Python scripts, which employ natural language processing (NLP) libraries for sentiment analysis and word cloud construction. Finally, the file was also processed with the Gephi software. Gephi is an open source software for exploring and manipulating networks that was developed by Bastian, Heymann, and Jacomy [28]. In our research we used this software for the identification of communities and relationships among users that generate and interact with tweets that contain the two keywords.

Sentiment Analysis
Once the extract, transform, and load (ETL) process was finished, we used natural language processing to perform sentiment analysis and word cloud visual representations.
The sentiment analysis stage was performed using lexicon-based sentiment analysis and classification, one of the most popular methods for measuring the polarity of sentiment of a collection of documents (Ahuja and Shakeel, 2017). In this particular case, we applied the Valence Aware Dictionary for Sentiment Reasoning (VADER), which is a rule-based model that is able to manage a wide variety of content generated in social media and compute its sentiment polarity. More than 7,000 items and their associated sentiments are handled by VADER. The quality of its measures was validated by humans before being incorporated into the sentiment lexicon dictionary [30,31]. VADER is very robust, does not need the model to be trained in advance to make the classifications, and is computationally efficient. These are the main reasons why it is widely used in the scientific community [32]. It was found to be able to outperform individual human raters [33].
The VADER NLP method evaluates each lexical feature of a tweet written in English and calculates a metric score for the sentiment of the tweet. Later, it applies five different rules that are based on general syntactic and grammatical conventions to adjust the initial metric score. The final score places the sentiment of the tweet on a scale from −1 (strongly negative sentiment) to 1 (strongly positive sentiment) [26].
In this research we have applied the following criteria to classify the sentiment of the tweets into two clusters: Tweets with scores ranging from −1 to −0.05 are considered negative and tweets with scores ranging from 0.05 to 1 are considered positive. Neutral tweets have scores from −0.05 to 0.05, but they are not the main focus of our research. These classification criteria avoid the bias involved in assigning a positive or negative sentiment to tweets with scores very close to 0, minimizing false positives and negatives [26,27].
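The classification criteria above can be sketched as a small function. This is a minimal illustration, not the script used in this research: the function name `classify_sentiment` is hypothetical, and assigning the exact boundary values (−0.05 and 0.05) to the negative and positive clusters is an assumption, since the text does not say to which cluster they belong.

```python
def classify_sentiment(compound: float, threshold: float = 0.05) -> str:
    """Map a sentiment score in [-1, 1] to a sentiment cluster.

    Assumption: scores exactly equal to +/- threshold are assigned to the
    positive/negative cluster rather than to the neutral one.
    """
    if not -1.0 <= compound <= 1.0:
        raise ValueError("score must lie between -1 and 1")
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical scores, for illustration only.
labels = [classify_sentiment(score) for score in (0.62, -0.31, 0.01)]
```

In practice, the score passed to such a function could come from an implementation of VADER, for example the `compound` value returned by `SentimentIntensityAnalyzer().polarity_scores(text)` in the `vaderSentiment` package; whether that particular package matches the implementation used here is not stated in the text.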
Hence, we were able to score whether the conversations that contained the words "sustainable" and "sustainability" happened in a positive or negative context in the social network community.
According to Figure 1, the distribution of tweets by their sentiment score shows that 8,813 tweets (62.82%) had a positive sentiment, 3,061 tweets (21.82%) were considered neutral, and 2,155 tweets (15.36%) had a negative sentiment. Therefore, we can see that the conversation about sustainability (no matter what the terms "sustainable" or "sustainability" are referring to) happens in a positive or neutral context most of the time (84.64% of the tweets).
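The percentages reported above follow directly from the tweet counts. The following short sketch, with the hypothetical helper name `sentiment_shares`, reproduces the arithmetic from the figures given in the text.

```python
# Tweet counts per sentiment cluster, as reported in the text.
counts = {"positive": 8813, "neutral": 3061, "negative": 2155}


def sentiment_shares(cluster_counts: dict) -> dict:
    """Return each cluster's share of the total, in percent, to two decimals."""
    total = sum(cluster_counts.values())
    return {name: round(100 * n / total, 2) for name, n in cluster_counts.items()}


shares = sentiment_shares(counts)
total_valid = sum(counts.values())
# Share of tweets that are positive or neutral.
positive_or_neutral = round(
    100 * (counts["positive"] + counts["neutral"]) / total_valid, 2
)
```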

Word Cloud Analysis and Segmentation
A word cloud is a visual representation of text that is developed based on the frequency of the usage of each word in the data collection. Word clouds are an increasingly popular method as they are very efficient in summarizing large amounts of data, in this case tweets from Twitter, and depict the ideologies behind a textual discourse [34,35]. In this research, the aim of the word cloud analysis was to understand the most relevant topics and concepts linked to the two keywords, "sustainable" and "sustainability", in the conversations that take place on social media.
In addition to this, we applied a clustering that combines the word cloud analysis with the NLP sentiment analysis performed in the previous stage of this research. As a result, we obtained one word cloud that identified the most relevant topics that raise positive feelings (accounting for 62.82% of the tweets in the sample) and another that identified the most relevant topics that raise negative feelings among the contributors on the social network (accounting for 15.36% of the total sample of tweets). For this purpose, our Python code used several existing Python libraries, such as matplotlib, pandas, and wordcloud, that are well known among the scientific and data science communities [36]. We designed the word clouds to show the one hundred most repeated words and word phrases for the two clusters of positive (Figure 2a) and negative (Figure 2b) sentiment tweets, obtaining two different word clouds and enriching the existing literature on word cloud segmentation [37,38].
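The frequency counting that underlies such a segmented word cloud can be sketched with the Python standard library. This is an illustrative sketch only: the tokenization rule, the short stop word list, the sample tweets, and the function name `top_terms` are all assumptions and do not reproduce the script used in this research, which also counted word phrases.

```python
import re
from collections import Counter

# Hypothetical, deliberately short stop word list for illustration.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on"}


def top_terms(tweets, n=100):
    """Return the n most frequent words across a list of tweet texts."""
    counts = Counter()
    for text in tweets:
        # Lowercase and keep alphabetic tokens (apostrophes allowed).
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(token for token in tokens if token not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical tweets from the positive cluster, for illustration only.
positive_tweets = [
    "Sustainable fashion is the future",
    "Sustainable energy for the future",
]
positive_top = top_terms(positive_tweets, n=2)
```

Applying the same function separately to the positive and the negative clusters yields two frequency tables, one per word cloud. A table of this kind, converted to a dictionary, could then be rendered with the `wordcloud` library, for example through `WordCloud.generate_from_frequencies`.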
Sustainability 2020, 12, x FOR PEER REVIEW 6 of 19
The larger the size of the word or word phrase in the word cloud, the greater its frequency. The different colors are simply for facilitating the visualization of the content. The following bar diagrams in Figure 3 show the quantitative representation of the top words or word phrases in terms of absolute frequencies for both positive (Figure 3a) and negative (Figure 3b) word clouds. Hence, these are the words with the highest frequencies in the tweets when we segmented them by their sentiment.

Social Network Analysis
Gephi is open source software for network analysis and visualization. It allows for the exploration and visualization of large networks of any kind, providing features such as high-quality layout algorithms, clustering, filtering by specific characteristics of the network, and statistics on the network as a whole as well as on its nodes, edges, and dynamics [28].
Feeding Gephi with the sample of tweets collected in the first stage of this research, we analyzed the dynamics of the accounts that contributed tweets containing at least one of the two words (sustainable, sustainability) and also of the accounts that acted as referrers and retweeted these contents within the period of observation (from the 23rd of November to the 22nd of December 2019). Hence, this network representation is focused on the activity of the accounts that generated tweets containing one of the two words "sustainable" or "sustainability", or that interacted as referrers of these tweets by retweeting them.
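One possible way of turning the collected tweets into a retweet network is sketched below. It is an assumption-laden illustration rather than the procedure followed here: it presumes that each tweet is a dictionary with the `user.screen_name` and `retweeted_status` fields of the standard Twitter tweet object, it directs each edge from the retweeting account to the original author, and the function name and sample data are hypothetical.

```python
from collections import Counter


def retweet_edges(tweets):
    """Count directed edges (retweeting account -> original author)."""
    edges = Counter()
    for tweet in tweets:
        original = tweet.get("retweeted_status")
        if original is None:
            continue  # an original tweet adds a node but no edge
        source = tweet["user"]["screen_name"]
        target = original["user"]["screen_name"]
        edges[(source, target)] += 1
    return edges


# Hypothetical tweets, for illustration only.
sample_tweets = [
    {"user": {"screen_name": "greenorg"}, "text": "Sustainable cities matter"},
    {"user": {"screen_name": "alice"},
     "retweeted_status": {"user": {"screen_name": "greenorg"}}},
    {"user": {"screen_name": "bob"},
     "retweeted_status": {"user": {"screen_name": "greenorg"}}},
    {"user": {"screen_name": "alice"},
     "retweeted_status": {"user": {"screen_name": "greenorg"}}},
]
edge_weights = retweet_edges(sample_tweets)
```

An edge table of this kind, written to a CSV file with source, target, and weight columns, is one of the formats that Gephi can import.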
For this purpose, we used ForceAtlas2, a continuous graph layout algorithm provided by Gephi. "ForceAtlas2 is a force directed layout: it simulates a physical system in order to spatialize a network. Nodes repulse each other like charged particles, while edges attract their nodes, like springs. These forces create a movement that converges to a balanced state. This final configuration is expected to help the interpretation of the data." [39]. This network allowed us to understand how the communities interact, and their taxonomy [40].
The description of the topology of the directed network in Figure 4 is as follows: The number of nodes was 15,666 and the number of edges was 5,215. Nodes are both accounts that create the tweets and accounts that interacted with these tweets, acting as referrers and spreading the message in the form of a retweet. The average degree of the network, a measure of connectivity among the accounts, was 0.524, indicating that the structure of the network consisted of many different communities, many of them really small (grey dots in Figure 4) and some others more relevant (large colored dots in Figure 4), having conversations about the sustainability topic but not really connected among them. This was also supported by the average clustering coefficient of the single nodes, which was 0.001. This is a very low but expected figure, as we were working with a network where the interactions among users consist of retweets.
The topology of the directed network in Figure 4 can be described as follows. The number of nodes was 15,666 and the number of edges was 5,215. Nodes are both the accounts that created the tweets and the accounts that interacted with these tweets, acting as referrers that spread the message in the form of a retweet. The average degree of the network, a measure of connectivity among the accounts, was 0.524, indicating that the network consisted of many different communities, many of them very small (grey dots in Figure 4) and some others more relevant (large colored dots in Figure 4), all having conversations about sustainability but barely connected to one another. This was also supported by the average clustering coefficient of the nodes, which was 0.001: a very low but expected figure, as we were working with a network in which the interactions among users consist of retweets. Accordingly, the average path length was 1.027, meaning that an account that acts as a referrer and retweets content from a specific account will spread the message to its own followers but not much further. This was also supported by the network diameter of 3, that is, the shortest distance between the two most distant accounts in the network.
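To make these topology measures concrete, the following minimal sketch computes them on a small toy network. Everything in it is an assumption made for illustration: the account names and edges are hypothetical, an edge (u, v) is taken to mean that v retweeted u, and the average degree is taken as edges per node, which is one common convention for directed graphs and may differ from the software's exact definition.

```python
from collections import deque

# Hypothetical toy retweet network: an edge (u, v) means v retweeted u.
edges = [
    ("author_a", "fan_1"), ("author_a", "fan_2"), ("author_a", "fan_3"),
    ("author_b", "fan_4"), ("author_b", "fan_5"),
    ("fan_1", "fan_6"),  # a retweet of a retweet: the only two-step path
]

nodes = sorted({n for edge in edges for n in edge})
successors = {n: [] for n in nodes}
for src, dst in edges:
    successors[src].append(dst)


def distances_from(start):
    """Breadth-first search: shortest directed distance to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in successors[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    del dist[start]  # keep only distances to other nodes
    return dist


# All finite shortest-path lengths between ordered pairs of distinct nodes.
path_lengths = [d for n in nodes for d in distances_from(n).values()]

average_degree = len(edges) / len(nodes)  # edges per node (assumed convention)
average_path_length = sum(path_lengths) / len(path_lengths)
diameter = max(path_lengths)  # longest of the shortest paths
```

In this toy case almost every reachable pair is one retweet apart, so the average path length stays close to 1 and the diameter is small, which is the pattern the paragraph above describes for star-shaped retweet communities.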
In the directed network, betweenness centrality, closeness centrality, and eccentricity were calculated using Brandes' algorithm [41]. Betweenness centrality represents how often a node appears on shortest paths between nodes in the network; closeness centrality is the average distance from a given starting node to all other nodes in the network; and eccentricity is the distance from a given starting node to the node farthest from it. The highest betweenness centrality of an account in this network was 22. This node (mvollmer1) is the one with the most control over the network, as it is the one through which the most information passes. The ten accounts with the highest betweenness are listed in Table 1. Closeness centrality tells us which accounts are able to spread information efficiently through the network.
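As an illustration of how betweenness arises in a retweet network, here is a minimal sketch that follows the two-phase structure of Brandes' algorithm for unweighted directed graphs (shortest-path counting by breadth-first search, then dependency accumulation). It is not the implementation used in the study; the function name, the toy accounts, and the edge direction (u -> v meaning that v retweeted u) are assumptions, and the scores are left unnormalized.

```python
from collections import deque


def betweenness(successors):
    """Unnormalized betweenness centrality for an unweighted directed graph."""
    score = {v: 0.0 for v in successors}
    for source in successors:
        order = []                               # nodes in non-decreasing distance
        preds = {v: [] for v in successors}      # predecessors on shortest paths
        n_paths = {v: 0 for v in successors}     # number of shortest paths from source
        dist = {v: -1 for v in successors}
        n_paths[source] = 1
        dist[source] = 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in successors[v]:
                if dist[w] < 0:                  # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:       # v lies on a shortest path to w
                    n_paths[w] += n_paths[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to the source.
        dependency = {v: 0.0 for v in successors}
        for w in reversed(order):
            for v in preds[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1 + dependency[w])
            if w != source:
                score[w] += dependency[w]
    return score


# Hypothetical toy network: an edge u -> v means v retweeted u.
toy = {
    "activist": ["amplifier"],
    "amplifier": ["reader_1", "reader_2"],
    "reader_1": [],
    "reader_2": [],
}
scores = betweenness(toy)
```

Here only the hypothetical "amplifier" account sits on shortest paths between other accounts (from "activist" to each reader), so it is the only node with a non-zero score. This mirrors the role the text attributes to accounts that relay content from a main contributor.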
Network diameter is the maximum eccentricity among the nodes. There were just five nodes that had an eccentricity equal to 3.
Finally, the modularity, with a value of 0.996, allows us to identify the different communities in the network, which are represented by the different colors in Figures 4 and 5. We used the Louvain method [42] for community detection, in which modularity ranges from -1 to 1 and compares the density of edges inside communities with the density of edges between communities.
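To give some intuition for why a network of many disconnected micro-communities produces a modularity close to 1, the sketch below evaluates the standard modularity formula for a given partition. It is not the Louvain method itself, which searches for the partition; it only scores one. The assumptions are loud ones: the graph is treated as undirected and unweighted, and the network is a hypothetical set of equal-sized, isolated "star" communities, which the real network is not.

```python
def modularity(edges, community_of):
    """Modularity Q of a partition of an undirected, unweighted graph.

    Q = sum over communities c of (L_c / m) - (d_c / (2 m)) ** 2,
    where m is the total number of edges, L_c the number of edges inside c,
    and d_c the sum of the degrees of the nodes in c.
    """
    m = len(edges)
    internal = {}  # L_c: edges with both endpoints in community c
    degree = {}    # d_c: total degree of community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            internal[cu] = internal.get(cu, 0) + 1
    return sum(
        internal.get(c, 0) / m - (degree[c] / (2 * m)) ** 2
        for c in degree
    )


def isolated_stars(n_stars, fans_per_star):
    """Hypothetical network: n_stars hubs, each retweeted only by its own fans."""
    edges, community_of = [], {}
    for s in range(n_stars):
        hub = f"hub_{s}"
        community_of[hub] = s
        for f in range(fans_per_star):
            fan = f"fan_{s}_{f}"
            community_of[fan] = s
            edges.append((hub, fan))
    return edges, community_of


q_few = modularity(*isolated_stars(4, 5))     # 4 disconnected communities
q_many = modularity(*isolated_stars(250, 5))  # 250 disconnected communities
```

For k equal-sized communities with no edges between them, the formula reduces to 1 - 1/k, so the two cases give 0.75 and 0.996. The second figure is only an order-of-magnitude intuition under these simplifying assumptions, not a reconstruction of the observed network.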
The size of each node (account) represents its out-degree centrality, that is, the number of links established with other accounts that retweeted its content containing either of the two keywords under investigation. GreenpeaceJP had the highest out-degree with 75 links, followed by MikeHudema with 62, XR_NYC with 51, and WEF with 49.
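The out-degree ranking can be sketched in a few lines. The edge list below is hypothetical, and it assumes the same convention as the text: a link runs from the account that authored the content to each account that retweeted it.

```python
from collections import Counter

# Hypothetical edge list: (original_author, retweeting_account).
retweets = [
    ("org_x", "user_1"), ("org_x", "user_2"), ("org_x", "user_3"),
    ("activist_y", "user_2"), ("activist_y", "user_4"),
    ("user_1", "user_5"),
]

# Out-degree = number of distinct accounts that retweeted each author.
# Converting to a set first drops repeated (author, retweeter) pairs.
out_degree = Counter(author for author, _ in set(retweets))
ranking = out_degree.most_common(2)
```

With these toy edges, "org_x" ranks first with three links and "activist_y" second with two.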

Empirical Analysis and Results
What we observed in the sample of tweets and retweets was that the term sustainability is now clearly associated with environmental sustainability. When the 100 most common words were taken from tweets scored as positive or negative, we found that at least 15 of the 50 most frequent terms appeared in both the positive and the negative conversations ("new", "future", "people", "time", "way", "good", "environment", "like", "want", "world", "climate change", "business", "development", "long term"). Therefore, regardless of whether the tweet in which the term was used carried a positive or a negative feeling, there was a set of signifiers that were used interchangeably and that, in some way, strengthened the understanding of the concept. What matters in applying the method explained above, however, is how, through an accurate clustering process, these and other words are used in tweets with a positive or negative feeling.
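The comparison of frequent terms across sentiment classes can be illustrated with a minimal sketch. The tokenized tweets below are invented for the example, and the helper name `top_terms` is hypothetical; the study's own preprocessing and cut-offs are not reproduced here.

```python
from collections import Counter

# Hypothetical, already-tokenized tweets grouped by sentiment label.
positive_tweets = [
    ["new", "sustainable", "energy", "future"],
    ["people", "want", "new", "future"],
    ["good", "energy", "new"],
]
negative_tweets = [
    ["climate", "crisis", "future", "people"],
    ["climate", "waste", "cost", "future"],
    ["climate", "crisis", "future", "new"],
]


def top_terms(tweets, k):
    """Return the set of the k most frequent tokens across the given tweets."""
    counts = Counter(token for tweet in tweets for token in tweet)
    return {term for term, _ in counts.most_common(k)}


# Terms that rank among the most frequent in both sentiment classes.
shared = top_terms(positive_tweets, 3) & top_terms(negative_tweets, 3)
```

In this toy data only "future" is among the three most frequent terms of both classes; with real data, the same intersection over the top 50 terms would yield the shared vocabulary discussed above.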

Positive Sentiment Cluster
When we deconstructed some of these relationships (Figure 2a), we saw that the use of "new" in tweets associated with a positive feeling (the most prominent term by absolute frequency according to Figure 3a) may indicate that the term "sustainability" is new in terms of its social use. It is used in accordance with the idea that, socially, the current modus vivendi is not sustainable in the UN's sense of environmental sustainability, defined in 1987 as the ability to satisfy present desires through the consumption of goods and services without jeopardizing the well-being of future generations.
From a social point of view, what is new is the emergence, in the individual and collective preference map, of 'environmental sustainability', with its own identity and strength, and its influence on decisions about the use and consumption of goods and services that the population makes.
No previous generation has faced, with such evident concern, an environmental problem like the one we know as climate change, caused by past and present generations. What is new is that there was a set of natural and environmental goods and services which, although an obvious part of people's real well-being, had never needed to have property rights defined over them. This is a good reason to underline a keyword in Olawumi and Chan: "environment" [16]. However, the problem is not new to the social sciences, since Hardin had already defined it as "the tragedy of the commons" [19]. In an excellent but little-known work, Georgescu-Roegen also made this clear, pointing out that perhaps social scientists, and especially economists, did not raise these issues earlier because of the "absence of difficulties" in the supply of raw materials by the most advanced countries, even though wars for their control had been unleashed in the past [21]. The growing scarcity of natural resources threatens the stability of our modus vivendi itself, and this, combined with climate change, forces us to consider the situation differently.
In the positive sense, it is also new that "sustainability" and "sustainable" are presented in the conversations as something closely related to the use of 'energy'. "Energy" is also a key word in Olawumi and Chan, in that the new uses of energy require immediate support [16]. In addition, with a high frequency, "sustainable" and "sustainability" are related to the future of "the people". On a second level of frequency, the terms have a logical value that only extends the first idea, in the sense of being associated with a problem of time, leading us to seek a way that allows us to see something good in what we call 'the environment', which we want to keep at certain levels of quality and quantity. To this end, the dimensions of such a project have a world scope.
The project started well in that social networks considered sustainability based on energy use to be of the utmost importance, since the current pattern of fossil fuel-based energy consumption is the main anthropogenic source of climate change. Sustainable energy is therefore seen as a positive solution. In the current scenario, world renewable electricity generation will reach only a little more than 17,000 TWh in 2040, whereas more than 25,000 TWh will be needed by that date to be in line with the Sustainable Development Scenario [43]. It is clear that an effective fight against climate change requires us to act on the patterns of energy evolution.
When social networks like Twitter use the words "future", "time", and "way", in relation to environmental sustainability, they are understood in the same sense as in the field of sciences. This is something that concerns time and the future and, therefore, there cannot be an approximation that is not essentially dynamic. In this sense, the importance of the term "way" in this context is underlined, since from a social point of view the key to sustainability requires a concrete way of doing things and designing the future [22].
Here, the question of whether this clashes with the mainstream and more ingrained current conceptions in the field of social sciences arises, and especially with the economic analysis of natural and environmental resources. For some illustrious economists, this concept of sustainability is "essentially vague" [24]. According to Solow, the question in terms of material well-being is whether we can assure future generations of the same level of well-being as present generations. For this, a sustainability rule is needed that ensures the highest level of well-being, updated for the present and for the future [23]. If welfare is expressed through the consumption of goods and services, the maximization of the same requires a trade-off between goods of the present and goods of the future. The well-being of the present requires a more intensive use of natural resources that will eventually become, in the best case, capital goods for the future. This new capital will make the future more productive, although with a smaller allocation of natural resources, higher man-made physical capital, higher human capital, and higher technological capital, which will compensate in terms of well-being with the loss of future natural capital.
In these terms, the concept of sustainability is associated with an opportunity cost of the future [43]. The future appears as an untouchable horizon in terms of present well-being and, therefore, as an obligation on the present that restricts the current framework of opportunities for decision-making that has intertemporal effects. However, what has to be sustainable in the long term is well-being [24] and, to that extent, it may be represented by different proportions of natural capital, manufactured capital, human capital, and technical capital. In this regard, it is very useful to study the analysis by Hartwick [44] to see the conception that underpins this sustainability rule. In another order of things, we should always remember the seminal article of Hotelling [23], in which a dynamic and long-term approach is given to any issue related to the management of natural resources, even if in that case the study was related to the intertemporal optimal allocation of nonrenewable resources, such as those of mining.
However, if the likely combinations entail irreversibility costs and increasing uncertainty about maintaining the same level of well-being in the future, a strategy of natural capital conservation would have an obvious economic rationale [45]. To the extent that one accepts a greater number of combinations, one can speak of weak sustainability; to the extent that substitutability between the components of possible combinations is expressly limited, either by retaining the initial allocation of natural capital or by significantly restricting its use, one speaks of strong sustainability [46]. This is the basis on which the paradigm of so-called ecological economics is built [47,48].
The question of the future is not only about whether a critical mass of natural capital should be left at all times in order to ensure environmental sustainability; as has been said, the debate on time and sustainability often revolves around the opportunity cost of that future [43]. To address it we must look at the real interest rate and the discount rate, that is, determine what remuneration is being waived in the present in order to make resources available for the future or, alternatively, what it costs us in the present to preserve resources for those who belong to the future. It is here that the issue of environmental sustainability or, more specifically, the fight against climate change is disputed: not as to whether it needs to be addressed as a dynamic long-term problem, but, precisely because the long term is involved, as to what discount rate should be applied, since the value chosen is crucial in deciding whether to incur costs today or to defer the solution and pay for it with resources generated in the future [49,50,51]. For more than twenty years, this literature has debated what discount rate should be used to calculate the present value of future welfare losses caused by climate change.
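The sensitivity of this debate to the chosen discount rate can be seen with a short worked example. It uses the standard present-value formula with annual compounding; the loss of 100 units, the 100-year horizon, and the two rates are hypothetical figures chosen only to show the effect, not values taken from the cited literature.

```python
def present_value(future_amount, annual_rate, years):
    """Value today of an amount received or lost `years` from now."""
    return future_amount / (1 + annual_rate) ** years


# Hypothetical figures: a welfare loss of 100 units occurring in 100 years.
loss, horizon = 100.0, 100
pv_low = present_value(loss, 0.01, horizon)   # 1% discount rate
pv_high = present_value(loss, 0.05, horizon)  # 5% discount rate
```

Under these assumptions the same future loss is worth roughly 37 units today at a 1% rate but less than 1 unit at a 5% rate, which is why the choice of rate largely determines whether acting now or deferring appears preferable.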
If we now turn to the next most frequent terms in our word cloud, we find "people". The word refers to a very inclusive, obviously popular social basis and to the foundation of the democratic regime. It should be remembered, for instance, that a typical democratic constitution like that of the US of 1787 begins with "We the people...", a phrase that carries a lot of weight in defining the concept of democracy. This perception is very consistent with some relevant contributions in the field of social sciences [45] that test the hypothesis of whether democracies, as forms of political organization, imply a greater commitment to quality international cooperation in this area. The results in Neumayer are conclusive: the estimators have the expected sign and are statistically significant, and they remain equally significant and relevant when the democracies of the most advanced countries are excluded from the samples.
Similarly, the term "people" connects with "future". On one hand, there is a perception that it is the youngest people who have taken the lead in the fight for environmental sustainability. Thus, the young Greta Thunberg, born in 2003, is an influencer of this social movement who, because of her age and her likely level of education, is unlikely to have any advantage in terms of knowledge and experience over other generations. This movement gains an advantage in its use of social media, since we cannot ignore that they are already a generation of people born to digital society. In any case, for purely probabilistic reasons, the future will affect them more than any other population cohort.
Moreover, they share a longer-term vision, since they are less concerned about their lives in the present than in the future. A related and far from minor issue is the claim of young people to the right to political choice, which would involve lowering the voting age to 16, the age that in many developed countries marks access to the labor market. In short, this clear reference to people in general and not to specific groups greatly universalizes the use of sustainability. That is, it makes it a very cross-cutting term from a social point of view.
In this way we reach a fourth frequency level in the conversations, in which it is clear that the fundamental issue is climate change and that solutions are associated with both the microeconomic (business) and the macroeconomic, needing a long-term approach in which industry and agriculture become "green" and "better" and can be actioned through communities in the local realm, using for this purpose both innovation and investment. Every word tweeted is very closely related to the aforementioned issues.

Negative Sentiment Cluster
From a negative perspective (Figures 2b and 3b), "sustainability" is associated with many terms similar to those already seen in the positive sentiment, albeit with a much lower frequency in all cases. "Sustainable" and "sustainability" are terms that allow us to overcome an increasingly overwhelming "climate crisis", associated with other key words like "climate change" [16]. In other words, from a social point of view, the 'climate crisis' is considered a new concern about the lack of sustainability in the way of life that we follow today. As before, it is something new that involves a climate change that affects "people", where the future is at stake. Now, with a negative perspective, the time is limited, so we need to find (make) "climate solutions", to attack in concrete fashion the "problem" of the environment, which is seen with a long-term perspective and that has to accompany "development". In this area, the most negative term is diagnosed as a 'climate emergency', in which "economy", "waste", "food waste", and "growth" are involved. These occur always in a "world environment", involving ("want", "like") a "climate action" that affects "business" and, no less important, "cost", so that "money" is seen as part of the problem.
What was most surprising in the results of this word cloud analysis was the similarity of terms, including the importance given to them, when comparing these social networks with the word cloud taking the sample of scientific research conducted in recent years in different fields [16]. It is the words "climate change", "impact", "innovation", "policy", "design" that dominate those professions, while social media conversations move in the same semantic territories.

Social Networks Analysis
Social networks will have different characteristics depending on the topics and the relationships that are studied [39]. Our network was made up of accounts with tweets that contained the words "sustainable" or "sustainability" and also the accounts that acted as prescribers retweeting this content (Figure 5). The network shows how content regarding sustainability spreads within the Twitter social network. There are many micro-communities, but they are not closely connected. We used centrality measures to describe how this network works.

Out-Degree Centrality Measure
Using the results of out-degree centrality (Figure 5 and Table 2), we observed that, in general, there were many communities of accounts interacting with a main contributor, but few connections among these communities. We found that nodes closer to active militancy for environmental sustainability had a higher out-degree. Among them is Greenpeace Japan, echoing the Climate Action Network platform, which included an active participation of outstanding Greenpeace Members International Committee. This organization maintained a deep skepticism regarding possible COP25 agreements at its meeting in Madrid and called for a "Climatestrike", which in that month meant a mobilization, according to its own sources, of more than 7 million people all over the world. Something similar can be noted from the node representing the Twitter activity of Mike Hudema, an environmental activist from Greenpeace Canada and one of the most influential figures of the Canadian province of Alberta, and some of his campaigns. Hudema has more than 120,000 followers on this network.
These two nodes make evident the importance to the network of Action on Climate, the "climate crisis", "green jobs", and the so-called Green New Deal. Other nodes, such as @XR_NYE, appear, with a major impact from a British movement called Extinction Rebellion, which in December 2019 was raising funds to campaign for the emergency through a New Year's Eve party in London, promoted through the network.
Similarly, there were nodes of obvious importance in social networks, according to their out-degree centrality measure, for which "sustainability" or "sustainable" are key terms for more general objectives. This is the case of the @WEF node, referring to the World Economic Forum, which in December pointed towards sustainability as a fundamental element, favoring a "civil society prepared" for what they call the fourth industrial revolution.
More focused on the vision and strategy of sustainable development, though less important than those mentioned above in terms of out-degree centrality, came very notable nodes, such as @UNESCO and that of the United Nations Secretary-General, António Guterres (@antonioguterres). The first, with more than 3 million followers, was able to enforce the insertion of sustainable development into the Global Goals. The second, with 733,000 followers, pointed out within our dates that "sustainable development is seriously off-track" and that sustainable development is "more than just a goal". The @EU_Commission node, with more than 400 retweets, stressing that "sustainability is part of Europe's DNA" and calling for the so-called "The European Green Deal", is also of note.
On the other hand, there were nodes of manifest importance, according to the out-degree centrality measure of the social network, that responded to the call to sustainable development or sustainability, but based on a political agenda or simple topicality. This is not to say that the matter is not relevant to the community of users to which they are central. This would be the case of nodes such as @DavMicRot, an economist and consultant who is very active on the topic of differentiating political formations according to their "sustainable capitalism", with more than 200 retweets and a project called "Threader", which generates more than 500,000 visits a month on the internet. The @JamesMelville node, with more than 140,000 followers, belongs to a Scottish consultant in sustainable development of the same name, who draws on his position to discuss international progress in sustainable development and the barriers he sees to it, in a context of reflection on British, and particularly Scottish, general issues.
Something similar can be noted from a node like @GeorgePeretz, with more than 15,000 followers, dealing with different current affairs from the point of view of a London lawyer. Other relevant nodes respond to these same characteristics, including the Arik Ring. Further away in terms of the issues dealt with on a daily basis, we can also point to influencers who at some point have regarded the issue of environmental sustainability as an element to take into account in their own way of life and that of their followers. This would be the case of @Nalisaaa, with almost 30,000 followers. In this area we should also highlight the @queerstorian node, an influencer that supports the collection of money for the construction of a "sustainable" house for a transgender Colombian woman. This Twitter account, with a little under 500 followers, gets almost 1000 retweets. In another order of magnitude, from the combined world of music and musicians, appears the node @katyperry, a singer who, with more than 108 million followers, tweets that she feels "proud to be part of the January Sustainability by Vogue Magazine", retweeted more than 600 times.

Betweenness Centrality Measure
The betweenness centrality measure allows us to measure not only the number of connections of a node, as is the case of out-degree, but the importance of that node as others go through it. This measure allows us to know, in terms of retweets, who are the ones who actually spread conversations, ideas, and suggestions on social networks about how to deal with issues related to sustainability and the sustainable. In our network, only the first two reached a value of betweenness well away from the rest. Everyone else took a value of half or less than the first.
The first thing of note is that the list of tweet-generating nodes (Table 3) has little to do with the list of those who transmit conversations, messages, and news through retweets (Table 1). Thus, the node that achieved the highest betweenness was @mvollmer1, who could be considered a technological influencer, positioned as head of innovation of a major company, Celonis. This Twitter account has accumulated more than 45,000 followers and devotes special attention to digital supply chains in their most sustainable versions. In this case the connection with @MikeHudema, one of the most meaningful nodes from the point of view of out-degree, is interesting, and allows us to think about the importance of its connection with the environmental movement and the challenges it poses for innovation, robotics, or artificial intelligence. Thereafter, other nodes receive all this information not directly from Hudema but from Vollmer. Something similar applies to some of the other nodes with the highest betweenness. An example is @haroldsinnott, a technology influencer recognized in Florida as one of the most influential professionals in business intelligence, with more than 65,000 followers.
Other nodes, such as the Californian @earthaccounting, with more than 21,000 followers promoting sustainable products and producers, have a manifest interest in spreading the culture of sustainability to the fullest and retweeting everything that involves its dissemination and reinforcement.
The node with the second highest betweenness was the @sdg30 account, managed from the United Nations and focused on disseminating the organization's 2030 sustainable development goals. This node continuously retweeted any news that had to do with the seventeen objectives of the sustainable development strategy. It was therefore a continuous dissemination of the terms sustainable and sustainability associated with very different topics.

Other Network Centrality Measures
The closeness centrality measure gives us a measure of how fast the tweets spread from one account to other accounts, in the form of retweets. In our case (Table) it was observed that just over a quarter of the nodes (27.17%) achieved a level of efficiency in the transmission of messages, while almost three quarters of the nodes (72.24%) in the network were unable to connect with the rest. The network was therefore not only structured around communities of nodes that were very consistent and relatively isolated, but a good part of the nodes were simple prescribers, retweeting content from other accounts without any diffusion capability.
With regard to the modularity measure, we noted that there is a strong community structure, where the accounts of the different communities are strongly connected with the main contributor, but the different communities are almost unconnected amongst themselves. High modularity is a characteristic of biological and other real-world networks.
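The split between accounts that spread content and accounts that only receive it can be sketched as follows. The sketch assumes the definition given earlier in the text (closeness as the average distance from a node to the nodes it can reach), assigns 0 to nodes that reach nobody, and uses a hypothetical toy network in which an edge u -> v means that v retweeted u. The software's exact normalization may differ.

```python
from collections import deque

# Hypothetical toy network: an edge u -> v means v retweeted u.
successors = {
    "author_a": ["fan_1", "fan_2", "fan_3"],
    "author_b": ["fan_4", "fan_5"],
    "fan_1": ["fan_6"],
    "fan_2": [], "fan_3": [], "fan_4": [], "fan_5": [], "fan_6": [],
}


def average_distance(start):
    """Mean shortest-path length from start to the nodes it can reach (0.0 if none)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in successors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    reached = [d for node, d in dist.items() if node != start]
    return sum(reached) / len(reached) if reached else 0.0


closeness = {node: average_distance(node) for node in successors}

# Share of accounts whose content reached at least one other account.
share_spreading = sum(1 for c in closeness.values() if c > 0) / len(closeness)
```

In this toy network three of the eight accounts have a non-zero value and the remaining five are pure receivers, a pattern analogous to the minority of spreading nodes reported above.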

Discussion and Implications
In general, sustainability is an issue of current importance and fashion, a fact which is reflected in the conversations generated on social networks, mainly linked to environmental sustainability. In order to test certain hypotheses about the sustainability concept and its use on social media, we used a social listening method. This method avoids biases as it analyzes a representative sample of tweets gathered from Twitter, in which the researchers have not mediated in any way. Analyzing these conversations and messages captures a momentum effect that is difficult to appreciate by other means. The method is also relevant because of the cost-effectiveness achieved with its application.
As a result of the research, the findings supported H1, showing that the use of social listening in Twitter can build a map that systematizes the feelings, opinions, and characteristics of the messages issued by users of the social network in relation to a particular topic, such as 'sustainability'. The investigation also confirmed H1a, as the social media approach to the environmental sustainability dialogue coincides fundamentally with what is said in other more formal settings, such as academia, national or international agencies, etc. confirming that social listening is a useful and a complementary tool for research for this topic and in the current context. It also serves to evaluate, not only in this topic, but also in others, if there are dissonances between what society thinks and what science and experts think.
In relation to the hypothesis raised about the sentiment generated by the conversation, it was shown that it was mainly positive, confirming H1b. Thus, 62.82% of tweets have positive sentiment and dialogue ensues for the construction, improvement, and search of solutions. Conversations developed in a context of positive feeling focused on social issues. Central to the groups are people in search of renewal and the construction of a different future, based on different behaviors, mainly related to, and with the goal of achieving, greater well-being in a long-term perspective.
The negative focus in the dialogue focused on the lack of sustainability in an environmental framework. They show concern and discontent about how sustainability is at risk. The conversations developed in the negative context focused on concern about climate change, confirming the hypothesis that society believes we are already facing a climate crisis and that it will have an irreparable impact on the future.
Our research also confirmed that Twitter is a valuable source of information for researchers and organizations. We showed that it is possible to gather a sample of tweets about a specific topic and develop a robust social network map of accounts according to their activity and the interactions among them, confirming H2. The network can be designed as a characterization of participants on social networks, both those who generate the content and those who disseminate and spread it. We confirmed that the "sustainability" concept is part of many global conversations in micro-communities that are barely connected to one another. There are a few communities that are relevant and led by important players, mainly institutional, such as Greenpeace, Mike Hudema, Extinction Rebellion NYC, or the World Economic Forum, but the greater part of the conversations happens in thousands of small communities without any relevant leader and with little capacity to disseminate their message. This is a sign that this topic is global in nature, not monopolized by anybody, and it can be considered a trending topic nowadays.
We also identified the role of some accounts that are influencers and act as message amplifiers from the main accounts that contribute about sustainability. These influencers are able to disseminate a specific message very widely through their followers. The main characteristic of these accounts is that they are linked to the world of science, technology, and activism, such as Dr Marcell Vollmer, SDG2030, or Harold Sinnott, etc. This means that in the composition of the most relevant communities, there are accounts with very different roles that contribute not only to generate content, messages, and conversation, but also to spreading them among the audience.

Conclusions
This research is novel because, to our knowledge, it is the first investigation to analyze the concept of sustainability in depth using social listening on Twitter, one of the most relevant social networks today.
We took advantage of the Twitter social network to gather a representative sample of data without interfering in the activities or conversations of the individuals, which minimizes the sample bias that affects more traditional methods.
By applying several technologies and methods, such as natural language processing (NLP), clustering, and social network analysis (SNA), we fulfilled two objectives. On the one hand, we analyzed the different contexts and areas of knowledge where the concept of "sustainability" was used, and the feelings that these conversations produced among the users on the social network. Using NLP for sentiment analysis, we found that the concept of sustainability is used mainly in a positive way in conversations on social media. Of the whole sample of tweets, 84.64% were scored as positive or neutral, while only 15.36% were scored as having a negative sentiment. This research also provides an overview of the different topics to which the concept of "sustainability" is linked nowadays when generating both positive and negative sentiment.
Positive feelings associated with sustainability are linked to the topic of "new", while negative feelings are linked to the topic of "climate crisis". Hence, in this research we extended the application of NLP and clustering based on sentiment analysis methods from previous research [26,27,33].
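The split between positive-or-neutral and negative tweets reported above is a simple share of scored tweets. As a minimal sketch, assuming each tweet has already received a numeric polarity score and that any score below zero counts as negative (both the scoring scale and the threshold are assumptions for illustration, not details taken from this study), the shares could be computed as follows. The function name `sentiment_shares` is hypothetical.

```python
def sentiment_shares(scores, negative_below=0.0):
    """Return the percentage of tweets scored negative vs. positive or neutral.

    Assumption: each score is a polarity value where values below
    `negative_below` indicate negative sentiment.
    """
    if not scores:
        raise ValueError("at least one score is required")

    negative = sum(1 for score in scores if score < negative_below)
    negative_pct = round(100 * negative / len(scores), 2)

    return {
        "positive_or_neutral": round(100 - negative_pct, 2),
        "negative": negative_pct,
    }


# Hypothetical toy scores: one negative tweet out of four.
print(sentiment_shares([0.5, 0.0, -0.3, 0.8]))
```

Grouping neutral tweets with positive ones mirrors the way the shares are reported here; a three-way split would only require a second threshold.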
In addition, we used Gephi software to map the social network of users who generate or spread content about sustainability, showing that sustainability is, in fact, a trending topic on social media, with thousands of micro-communities having conversations about it but without anybody leading or guiding these conversations [28]. We also identified the different kinds of accounts that play a key role in the generation and spreading of messages on social networks.
Thus, we have confirmed that social media listening is a powerful tool to complement more formal fields of research, with the additional benefit of providing real-time information on many topics of interest in the social sciences. In this sense, this paper complements previous literature about the concept of "sustainability" [16], bridging a gap and paving the way for new lines of research based on data analysis on social media that can be readily implemented for the benefit of both researchers and organizations.
Despite its strengths, the study has some limitations that indicate directions for future research. Firstly, the large data sample of tweets was gathered within a single month, December 2019. Future research should validate that there is no seasonal bias and that the results and conclusions remain stable when using a new sample of tweets gathered in a different period. In addition, we would like to gain a deeper understanding of the dynamics of the social network around the concept of "sustainability" and see how the relationships among accounts evolve over time, providing a longitudinal view of the network.
Secondly, our sample of tweets only contained searchable public content published by the users on the social network. This means that content cannot be accessed if its owners choose to keep it private.
Finally, we deliberately limited the sample of tweets to those written in English. Hence, this analysis applies to the concept of "sustainability" as used by English speakers, and it would be interesting to replicate the same analysis for other languages, such as Spanish, to understand whether different cultures interpret and use the concept in a different way.