A Novel Framework for Mining Social Media Data Based on Text Mining, Topic Modeling, Random Forest, and DANP Methods

The huge volume of user-generated data on social media results from the aggregation of users’ personal backgrounds, past experiences, and daily activities. These data, the so-called “big data,” have been studied intensively during the past few years. In spite of the impression one may get from the media, a great deal of data processing has not yet been uncovered by existing techniques of data engineering and processing. Very few scholars have tried to do so, especially from the perspective of multiple-criteria decision-making (MCDM). MCDM methods can derive influence relationships and weights associated with aspects and criteria, which can hardly be achieved by traditional data analytics and statistical approaches. Therefore, in this paper, we propose an analytic framework to mine social networks, feed the meaningful information to MCDM methods based on a theoretical framework, derive causal relationships among the aspects of the theoretical framework, and finally compare the causal relationships with a social theory. Latent Dirichlet allocation (LDA) will be adopted to derive topic models based on the data retrieved from social media. By clustering the topics into aspects of the social theory, the probability associated with each aspect will be normalized and then transformed to a Likert-type 5-point scale. Afterwards, for every topic, the feature importance of all other topics will be derived using the random forest (RF) algorithm. The feature importance matrix will be transformed to the initial influence matrix of the decision-making trial and evaluation laboratory (DEMATEL). The influence relationships among the aspects and criteria, as well as the influence weight versus each criterion, can then be derived by using the DEMATEL-based analytic network process (DANP).
To verify the feasibility of the proposed framework, Taiwanese users’ attitudes toward air pollution will be analyzed based on the value–belief–norm (VBN) theory by using social media data retrieved from Dcard (dcard.tw). Based on the analytic results, the causal relationships are fully consistent with the VBN framework. Further, the mutual influences derived in this work that were seldom discussed by earlier works, i.e., the mutual influences between altruistic concerns and egoistic concerns, as well as those between altruistic concerns and biosphere concerns, are worth further investigation in the future.


Introduction
Social media are web-based services that allow people, publics, and organizations to cooperate, link, network, and form communities. Such services allow users to easily generate, co-generate, adapt, share, and participate in user-created web content [1]. In the past few years, social media have become a dominant part of daily life for most people, with enormous implications for regional, national, and global economies and political situations [1]. As the impact of conventional media lessened, social media rapidly diffused throughout the world. [...] widely adopted. These MCDM-based methods can give insights into decision-making problems, e.g., the influence relationships and influence weights, which analytic frameworks based on statistical methods cannot provide. The integration of MCDM methods with big data analytics in general, and social media mining in particular, has been rare. However, such integration can yield results that differ considerably from those of methods that combine big data analytics with statistical analysis, e.g., social media mining with PLS-SEM.
Data retrieved from social media usually contain meaningful information. However, few scholars have tried to analyze these data based on decision-making methods. A document usually contains numerous topics; according to Chen et al. [13], even a short document may contain multiple topics. These topics can serve as the criteria for a decision-making problem, and the problem is, by nature, an MCDM one. The influence relationships among the major variables in the social media data and the weights associated with these variables can be derived in order to provide meaningful insights. However, to the best of the authors' knowledge, very few scholars have tried to mine social media using MCDM methods. Although MCDM methods can potentially provide specific insights into big data in general, and social media data in particular, few scholars have proposed analytic frameworks to address this research gap. Furthermore, almost no scholars have proposed an integrated framework to derive the influence relationships among the aspects of a theoretical framework. Thus, it is necessary to integrate information retrieved from social media sites into an established theoretical framework. Therefore, in this paper, we propose an analytical framework to mine a social network, analyze the meaningful information using decision-making methods based on a specific theoretical framework (e.g., the technology acceptance model or the value-belief-norm theory [14]), derive causal relationships among the aspects of the theoretical framework, and, finally, compare the causal relationships with a social theory.
First, social media sites will be trawled. The user-generated contents related to some specific social issue(s) will be retrieved. Then, the Latent Dirichlet allocation (LDA) technique will be adopted to derive topic models based on the data retrieved from social media. According to the probability associated with each topic, the topics will be clustered. Then, these topics will be classified into a specific aspect of a model of a social theory. To feed the probability data into the computation, the probability associated with each aspect of the model of the social theory will be normalized using a Likert-type 5-point scale. Afterwards, for every topic, the random forest (RF) algorithm will be adopted to derive the feature importance of all other topics. The feature importance matrix will be transformed into the initial influence matrix of DEMATEL. The influence relationships can be derived, along with the influence weight versus each criterion, by using DANP. The consistency between the influence relation map (IRM) and the social theory model will be checked. Discrepancies will be derived, which can provide further insights regarding social phenomena. The contents generated by Taiwanese users regarding attitudes toward the air pollution problem will be retrieved from Dcard (www.dcard.tw, accessed on 1 July 2021) to verify the feasibility of applying social media data to the value-belief-norm theory proposed by Stern et al. [14]. For readers' convenience, the abbreviations and symbols introduced in this work are listed in Tables A1 and A2 in Appendix A.
The remainder of this paper is organized as follows: Section 2 reviews the relevant literature regarding the emergence of social media, the mining of social media, data-driven decision-making (DDD), past works on the integration of data analytics and MCDM methods, and research gaps. Research methods, which include the analytic process, topic modeling, RF, DEMATEL, and DANP, will be reviewed in Section 3. Section 4 presents the analytic results of text mining, topic modeling, cluster analysis, DEMATEL, and DANP. Finally, the results are discussed in Section 5. Section 6 concludes the whole work.

Literature Review
According to Kaplan and Haenlein [15], social media are the set of internet-based applications which are built upon the concepts and technology of Web 2.0; social media enable the generation and exchange of content generated by users [2]. Numerous classes of social media sites have been created. Typical examples include Facebook (for social networking), Twitter (for microblogging), YouTube (for video sharing), etc. [2]. Social media mining is an emerging interdisciplinary research field whose arena includes techniques from computer science, statistics, sociology, and ethnography [2]. DDD is a practice of decision-making, where decisions are based on data analytics instead of on intuitions only [8]. Better data provide more chances for enhanced decision-making results [16]. During the past few decades, MCDM methods have been developed and adopted for numerous applications. However, in the age of big data analytics, DDD based on MCDM methods has seldom been adopted in manipulating big data in general and social media data in particular. Thus, in this section, past works on the emergence of social media, social media mining, DDD, MCDM-based DDD, and research gaps will be reviewed. The literature will serve as the basis for developing the integrated framework consisting of social media mining and MCDM methods.
Social media is not based on a single technology. Instead, social media integrate wide-ranging techniques, which include numerous online services that augment the capability of mutual communication in the social environment that forms the organization [17]. The kernel of social media is grounded on the provision of high visibility and open participation [17]. For practical applications, social media provide features which allow seamless sharing, commenting, responding, syndicating and interacting with content (text, voice and video) and connecting with others, and following and interacting with their activity streams [15,18]. Thus, social media offer a flexible platform which is fundamentally organic, free-flowing, and constructed to enable dynamic and emergent feedback loops of communication within a social group [17].
Nowadays, social media platforms are typically used to express opinions or viewpoints regarding social events, news, etc., anywhere and at any time. Predicting the future is a great wish of mankind [19]. In order to meet this forecasting demand, many studies have demonstrated the importance of social media data (e.g., [10,20–22]). Therefore, during the past several years, scholars (e.g., [23,24]) have demonstrated numerous applications in the related fields of social science [19].
Social media mining refers to the process of characterizing, analyzing, and deriving important patterns from data retrieved from social media, which are the result of social interaction [2]. Social media mining is a multidisciplinary domain which includes techniques from computer science, data engineering, social science, and mathematics [5]. The exploration of social media by the above-mentioned techniques helps us understand the mutual interactions of users [2]. Further, interesting patterns, information diffusion, influence relationships, effective and efficient recommendations, as well as novel social behavior can be explored on social media sites [2]. DDD refers to data analytics-based decisions [8]. Good sources of data imply better opportunities for good decisions [16]. Novel digital techniques have greatly enhanced the quality and quantity of data available for decision-makers [16].
The advantages of DDD have been verified convincingly [8]. Brynjolfsson et al. have demonstrated how companies' performance can be enhanced by using DDD [8]. DDD is also related to better financial results [8]. DDD has been broadly applied in numerous domains such as medical science, environmental engineering, education, energy management, policy definitions, etc. [20].
Nowadays, people are facing complicated decision-making problems that are filled with tremendous amounts of information, which can describe diverse aspects of the problems via different methods. For decision-makers, uncovering an ideal solution to a decision-making problem is not easy [20]. A rational method to tackle this kind of problem is to analyze various aspects and then integrate the analyses to create final solutions to the problems [20]. This approach is called MCDM [20]. During the past few decades, numerous works based on MCDM have been conducted to assist people in solving complicated problems [20].
Traditional MCDM methods such as the AHP, the ANP, DEMATEL, and the DANP have been widely adopted for many decision-making problems. The AHP proposed by Saaty [10] aims to derive the weights relating to each aspect and criterion of a decision-making problem by assuming independence among these aspects and criteria. Saaty also proposed the ANP [21], which can derive the weights associated with the aspects and criteria of a decision-making problem by relaxing the assumption of independence. DEMATEL, proposed by Gabus and Fontela [22] of the Battelle Geneva Institute, has been widely adopted to construct the influence relationships among the aspects and criteria of an MCDM problem. The DANP, a fusion of DEMATEL and the ANP, can easily derive the influence weights of each aspect and criterion of an MCDM problem based on the results of DEMATEL. The DANP simplifies the analytic procedure of the ANP-based methods and considers every influence relationship while deriving the influence weights. In ANP-based methods, a threshold value is usually defined to avoid too much complexity in the structure of the decision-making problems to be solved. From a traditional perspective, it is very reasonable to adopt these methods. However, in the era of big data, decision makers can further consider the possibility of incorporating big data into the decision-making process instead of relying on a very limited number of experts. In the age of big data analytics, data fill the whole analytic process of MCDM [20]. Therefore, generating reasonable solutions based on contemporary observations and past data has turned out to be a dominant and fascinating matter [20]. To resolve this problem, Fu et al. [20] proposed a DDD framework based on the MCDM method, which has become the focus.
Few scholars have tried to integrate machine learning algorithms and MCDM methods to tackle big data in general and social media data in particular. Recently, Yang et al. [23] used text mining methods to retrieve papers that adopt deep learning algorithms (a subset of machine learning) and MCDM methods in using big data. Limited results were retrieved from major academic databases, including ScienceDirect, ACM, IEEE, Springer, Taylor & Francis, and Wiley Online Library. Some of these works use the AHP to assess risks [24], such as preparing a flood hazard susceptibility map [25]. However, as mentioned in the prior paragraph, the assumptions of independence among the aspects and criteria bias the results. Yasmin et al. [26] used intuitionistic fuzzy DEMATEL (IF-DEMATEL) and the ANP to analyze the capabilities of big data analytics for firms. However, they are not really dealing with big data. Meanwhile, the framework faces problems similar to those mentioned in the prior paragraph: the complicated survey procedure and the loss of valuable information due to the threshold definition.
Muruganantham and Gandhi [27] provide one of the few studies to incorporate social media data into an MCDM method. In their study, the Technique for Order Performance by Similarity to Ideal Solution (TOPSIS) was introduced to rank influencers in a given social media data set. However, no influence relationships, weights, or confirmation with theoretical frameworks could be provided due to the natural limitation of TOPSIS, which aims only to rank the alternatives.
In general, in spite of the impression one may get from the media, much data processing has not yet been uncovered by existing techniques of data engineering and processing. Therefore, investigations on the integration of social media, NLP, and other methods of data analytics will be very important for deriving novel implications of the data retrieved from social media in general, and the data related to a specific theoretical framework in particular. However, very few scholars have tried to do so, especially from the perspective of MCDM, which can derive influence relationships that can hardly be achieved by traditional data analytics and statistical approaches. Therefore, in this paper, we propose an analytic framework to mine social networks, feed the meaningful information to MCDM methods based on a theoretical framework, derive causal relationships among the aspects of the theoretical framework, and finally compare the causal relationships with a social theory.

Research Methods
First, social media sites will be trawled. The user-generated contents related to some specific social issue(s) will be retrieved. After that, the LDA technique will be adopted to derive topic models based on the data retrieved from social media. According to the probability associated with each topic, the topics will be clustered. Then, these topics will be classified into a specific aspect of a social theory model. To feed the probability data into the computation, the probability associated with each aspect of the model of the social theory will be normalized using a Likert-type 5-point scale. Next, for every topic, RF will be adopted to derive the feature importance of all other topics. The feature importance matrix will be transformed into the initial influence matrix of DEMATEL. The influence relationships can then be derived. The influence weight versus each criterion can be derived by using DANP. The consistency between the IRM and the social theory model will be checked. Discrepancies will be derived, which can provide further insights regarding social phenomena. The three data analytic techniques, namely topic modeling, hierarchical cluster analysis, and RF, together with the DANP method, will be introduced in the following subsections. These methods will be used to derive data from social media sites, derive latent topics, cluster these topics into theoretical frameworks, derive feature importances, and then feed these feature importances into DANP to derive meaningful implications. The proposed process consists of the following five steps (see Figure 1 below):

Text Mining, Topic Model and LDA
Text mining was first proposed by Feldman et al. [28]. The term refers to the procedure of retrieving high-quality information from text, which includes structured, semistructured, and unstructured text resources such as documents, videos, and images [29]. Text mining involves the extraction of information from text and the retrieval of text to derive rules and patterns [30]. Text mining also provides methods for analyzing and contextualizing massive volumes of information [31]. Fundamentally, it is a quantitative method for analyzing (usually) big textual data; the techniques help accelerate knowledge discovery by drastically increasing the amount of data that can be analyzed [32].
One of the most popular methods of text mining is topic modeling. The method can effectively and systematically analyze many documents in a very short period of time. Among the topic modeling techniques, LDA [33], which is grounded on statistical distributions, is the most widely adopted. The basic assumption of LDA is the exchangeability of words and documents in a corpus, i.e., the bag-of-words assumption. LDA recognizes semantically correlated words that co-occur in numerous documents in a corpus. After that, the resulting groups of words are interpreted by humans as meaningful subjects. For example, LDA assigns "gene," "DNA," and "genetic" to a topic that is interpreted as "genetic" [34].
In the following, we define the terms and formulate the probabilistic model of a corpus based on the original definitions by Blei et al. [33]. A corpus $D$ is defined as a collection of $M$ documents. The number of words belonging to any one document $d$ in the corpus is $N_d$, where $d \in \{1, \cdots, M\}$. The LDA algorithm models the corpus according to the generative process below, based on the original definitions by Blei et al. [33] and Jelodar et al. [35]:
(a) Select a multinomial distribution $\phi_{t_p}$ for the topic $t_p$ ($t_p \in \{1, \cdots, T\}$) from a Dirichlet distribution with parameter $\beta$.
(b) Select a multinomial distribution $\theta_d$ for document $d$ ($d \in \{1, \cdots, M\}$) from a Dirichlet distribution with parameter $\alpha$.
(c) For each word position $\eta \in \{1, \cdots, N_d\}$ in document $d$, select a topic $z_{d\eta}$ from $\theta_d$ and then select a word $w_{d\eta}$ from $\phi_{z_{d\eta}}$.
Here, $\alpha$ is the Dirichlet parameter of the per-document topic distributions; $\beta$ is the Dirichlet parameter of the per-topic word distributions; $\phi_{t_p}$ is the word distribution for topic $t_p$; and $\theta_d$ is the topic distribution for document $d$.
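The generative process can be illustrated with a small simulation. The following is only a minimal sketch using NumPy, not the authors' implementation: the function name `generate_corpus`, the symmetric scalar values of `alpha` and `beta`, the fixed document length, and the toy sizes are all assumptions made for illustration, and the per-word sampling step follows the standard LDA formulation of Blei et al. [33].

```python
import numpy as np

def generate_corpus(M=4, T=2, vocab_size=6, doc_len=8, alpha=0.5, beta=0.1, seed=0):
    """Sample a toy corpus from the LDA generative process (illustrative only)."""
    rng = np.random.default_rng(seed)
    # (a) one word distribution phi_t per topic, drawn from Dirichlet(beta)
    phi = rng.dirichlet(np.full(vocab_size, beta), size=T)
    # (b) one topic distribution theta_d per document, drawn from Dirichlet(alpha)
    theta = rng.dirichlet(np.full(T, alpha), size=M)
    docs = []
    for d in range(M):
        # per word position: draw a topic z from theta_d, then a word w from phi_z
        z = rng.choice(T, size=doc_len, p=theta[d])
        w = np.array([rng.choice(vocab_size, p=phi[t]) for t in z])
        docs.append(w)
    return phi, theta, docs

phi, theta, docs = generate_corpus()
```

Each row of `phi` and `theta` is a probability vector, and each document is a sequence of word indices. In the real setting only the words are observed, and `phi` and `theta` must be inferred from them.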
In the above-mentioned generative process, the words in the documents are observed variables, while the others are latent variables ($\phi$ and $\theta_d$) and hyperparameters ($\alpha$ and $\beta$). The probability of the observed data $D$ is computed as follows in Equation (1):
$$p(D \mid \alpha, \beta) = \prod_{d=1}^{M} \int p(\theta_d \mid \alpha) \left( \prod_{\eta=1}^{N_d} \sum_{z_{d\eta}} p(z_{d\eta} \mid \theta_d)\, p(w_{d\eta} \mid z_{d\eta}, \beta) \right) d\theta_d \quad (1)$$
where $z_{d\eta}$ is the topic for the $\eta$-th word in document $d$ and $w_{d\eta}$ is the specific word. Based on the above definitions, the probability of the observed data will be derived using the LatentDirichletAllocation class in the scikit-learn Python toolkit [36].
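As a hedged illustration of how such a topic model might be fitted with scikit-learn, the sketch below uses a tiny English toy corpus. The documents, the number of topics, and the variable names are invented for this example and do not reflect the authors' data or settings.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Toy corpus (hypothetical); the study itself uses posts retrieved from Dcard.
docs = [
    "air pollution mask health cough",
    "factory emission policy government regulation",
    "mask health hospital cough air",
    "government policy emission tax factory",
]

# Bag-of-words counts: rows are documents, columns are vocabulary terms.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(docs)

# Fit LDA with two topics; random_state fixes the variational initialization.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topic = lda.fit_transform(X)   # per-document topic proportions
topic_word = lda.components_       # unnormalized per-topic word weights
```

Each row of `doc_topic` sums to one and can be read as the topic probabilities of a document, which is the quantity later clustered and rescaled in the framework.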

The RF Technique
The RF method was proposed by Breiman [37] in 2001. It has been particularly effective as a classification and regression method. RF-based methods combine several randomized decision trees and average the predictions of these trees. These methods have demonstrated outstanding performance when the number of variables is much larger than the number of observations [38]. Furthermore, the RF can be applied to large-scale problems, can easily be adapted to various learning tasks, and returns measures of variable importance [38].
Based on the work of [39], the variable importance of an RF can be defined as follows. Assume a set $V = \{x_1, \cdots, x_p\}$ of categorical input variables and a categorical output $y$. Given a training sample $S$ of $n$ joint observations of $x_1, \cdots, x_p, y$ drawn from $P(x_1, \cdots, x_p, y)$, let us define for any internal node $t$ of a decision tree built from $S$:
the number of training samples in $t$ as $n_t$;
the ratio of training samples in $t$ as $p_r(t) = n_t / n$;
the impurity of node $t$ as $i_p(t) = H(y \mid t)$;
the impurity reduction at node $t$ as $\Delta i_p(t) = i_p(t) - \frac{n_{t_L}}{n_t} i_p(t_L) - \frac{n_{t_R}}{n_t} i_p(t_R)$, where the subscripts $L$ and $R$ denote the left and right child nodes of node $t$.
In an ensemble of decision trees, the mean decrease impurity (MDI) importance of an input variable $x_m$ is the sum of the weighted impurity reductions $p_r(t)\Delta i_p(t)$ for all nodes $t$ where $x_m$ is used, averaged over all $n_T$ trees in the ensemble:
$$Imp(x_m) = \frac{1}{n_T} \sum_{T_S} \sum_{t \in T_S : v(t) = x_m} p_r(t)\, \Delta i_p(t)$$
where $T_S$ is a tree structure representing an input-output model and $v(t)$ is the variable adopted to split node $t$ [39]. A fully developed, totally randomized decision tree is one in which every node $t$ is divided by means of a variable $x_{i_{RF}}$ selected uniformly at random (from among those variables which have not been used at the parent nodes) into $\aleph_{i_{RF}}$ sub-trees (i.e., one for every possible value of $x_{i_{RF}}$); the recursive construction ends when each one of the $p$ variables has been used along the present branch [39]. The MDI importance of $x_m \in V$ for $y$, as computed with an infinite ensemble of fully developed totally randomized trees and an infinitely large training sample, is:
$$Imp(x_m) = \sum_{k_r=0}^{p-1} \frac{1}{C_p^{k_r}} \frac{1}{p - k_r} \sum_{B \in P_{k_r}(V^{-m})} I(x_m; y \mid B)$$
where $V^{-m}$ denotes the subset $V \setminus \{x_m\}$, $P_{k_r}(V^{-m})$ is the set of subsets of $V^{-m}$ of cardinality $k_r$, and $I(x_m; y \mid B)$ is the conditional mutual information of $x_m$ and $y$ given the variables in $B$ [39].
For any ensemble of fully developed trees in asymptotic learning sample size conditions, we have
$$\sum_{m=1}^{p} Imp(x_m) = I(x_1, \cdots, x_p; y).$$
Moreover, a variable $x_i \in V$ is irrelevant to $y$ with respect to $V$ if and only if its infinite sample size importance, as computed with an infinite ensemble of fully developed totally randomized trees built on $V$ for $y$, is 0 [39].
Let $V_R \subseteq V$ be the subset of all variables in $V$ that are relevant to $y$ with respect to $V$. The infinite sample size importance of any variable $x_m \in V_R$, as computed with an infinite ensemble of fully developed totally randomized trees built on $V_R$ for $y$, is the same as its importance computed in the same conditions by using all variables in $V$ [39].
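To make the node-level quantities concrete, the following sketch computes the weighted impurity reduction of a single split, using Shannon entropy as the impurity measure. It is an illustration only: the function names are hypothetical, and the child weights are taken as fractions of the samples reaching the node, which is the usual convention for this measure.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy H(y | t) of the labels that reach a node."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def weighted_impurity_reduction(y_node, y_left, y_right, n_total):
    """p_r(t) * delta_i(t) for one split of node t into a left and a right child."""
    n_t = len(y_node)
    p_t = n_t / n_total  # share of all training samples that reach node t
    delta = (entropy(y_node)
             - len(y_left) / n_t * entropy(y_left)
             - len(y_right) / n_t * entropy(y_right))
    return p_t * delta

y = ["a", "a", "b", "b"]
perfect_split = weighted_impurity_reduction(y, ["a", "a"], ["b", "b"], n_total=4)
useless_split = weighted_impurity_reduction(y, ["a", "b"], ["a", "b"], n_total=4)
```

A split that separates the two classes completely removes the full one bit of entropy, whereas a split that leaves the class mix unchanged contributes nothing. Summing such contributions over the nodes that use a variable, and averaging over trees, gives the MDI importance.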
Based on the above definitions, for every topic derived in Section 3.1, the feature importance of all other topics will be derived using the RF algorithm with the RandomForestRegressor class in the scikit-learn Python toolkit [36]. The feature importance matrix will be transformed into the initial influence matrix of DEMATEL, which will be introduced in Section 3.3.
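One possible way to assemble such a matrix is sketched below. This is a hedged illustration rather than the authors' code: the helper `feature_importance_matrix`, the synthetic scores, the forest size, and the convention that column j holds the importances obtained when topic j is the regression target are all assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def feature_importance_matrix(scores, n_estimators=100, seed=0):
    """scores: (n_docs, T) array of topic scores.

    Returns a T x T matrix F in which F[i, j] is the importance of topic i
    when topic j is the regression target (the diagonal stays 0).
    """
    n_docs, T = scores.shape
    F = np.zeros((T, T))
    for j in range(T):
        others = [i for i in range(T) if i != j]
        rf = RandomForestRegressor(n_estimators=n_estimators, random_state=seed)
        rf.fit(scores[:, others], scores[:, j])
        F[others, j] = rf.feature_importances_  # importances sum to 1 per target
    return F

# Synthetic 5-point scores for four topics (hypothetical data).
rng = np.random.default_rng(0)
scores = rng.integers(1, 6, size=(60, 4)).astype(float)
scores[:, 3] = scores[:, 0]  # make topic 3 fully determined by topic 0
F = feature_importance_matrix(scores, n_estimators=50)
```

Because topic 3 is a copy of topic 0 in this synthetic example, nearly all of the importance in column 3 falls on row 0, which shows how a strong dependence between two topics surfaces in the matrix.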
The feature importance matrix $M_F$ is a $T \times T$ matrix that collects the feature importances derived by the RF. In each column, the criteria importance serves as the influence degree from a topic to some other specific topic. Let $f_{i_\omega j_\omega}$ denote the element in row $i_\omega$ and column $j_\omega$ of the transposed matrix. Each column of the transposed matrix is normalized by its largest element, namely $l^{f}_{j_\omega}$. Then, for consistency with the Likert 5-point scale adopted in later methods, the normalized result is multiplied by 5 to give the element $\omega_{i_\omega j_\omega}$ of the $\Omega$ matrix:
$$\omega_{i_\omega j_\omega} = 5 \cdot \frac{f_{i_\omega j_\omega}}{l^{f}_{j_\omega}}, \qquad \Omega = [\omega_{i_\omega j_\omega}]_{T \times T}.$$
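The rescaling step can be sketched as follows. This is a minimal example with made-up numbers; the function name `to_likert_influence` is hypothetical, and whether the matrix needs to be transposed first depends on how the importances were stored, so the function simply normalizes the columns of whatever matrix it receives.

```python
import numpy as np

def to_likert_influence(F):
    """Divide each column by its largest element, then multiply by 5."""
    F = np.asarray(F, dtype=float)
    col_max = F.max(axis=0)
    # Guard against an all-zero column to avoid division by zero.
    safe_max = np.where(col_max > 0, col_max, 1.0)
    return 5.0 * F / safe_max

# Hypothetical 3 x 3 importance matrix with a zero diagonal.
F = np.array([[0.0, 0.2, 0.5],
              [0.6, 0.0, 0.5],
              [0.4, 0.8, 0.0]])
omega = to_likert_influence(F)
```

After this step the largest entry of every column equals 5 and all other entries are scaled proportionally, so the matrix lies on the same 0 to 5 range as the scale used for the later methods.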
The basic DEMATEL formulas by Tzeng and Huang [50], Yang et al. [40], and Huang et al. [47] are explained in the following procedure. First, the initial direct relation matrix (IDRM) is formulated. Based on the $\Omega$ matrix derived by the RF, the influence of topic $i_d$ on topic $j_d$, denoted as $a_{i_d j_d}$ in the IDRM, is equal to $\omega_{i_d j_d}$ in the $i_d$-th row and the $j_d$-th column. Thus,
$$A = [a_{i_d j_d}]_{T \times T} = [\omega_{i_d j_d}]_{T \times T}.$$
Here, the numbers of rows and columns equal the number of topics $T$. Then, the IDRM is normalized by multiplying it with a factor $\rho$ using Equation (7) below, where $\rho$ is the smaller of the reciprocals of the maximum row sum and the maximum column sum. That is,
$$N_R = \rho A, \qquad \rho = \min\left\{ \frac{1}{\max_{i_d} \sum_{j_d=1}^{T} a_{i_d j_d}},\ \frac{1}{\max_{j_d} \sum_{i_d=1}^{T} a_{i_d j_d}} \right\} \quad (7)$$
Then, the total relation matrix (TRM), $T_R = [t_{i_d j_d}]_{T \times T}$, can be derived as:
$$T_R = N_R + N_R^2 + \cdots + N_R^k = N_R (I - N_R)^{-1}, \quad k \to \infty.$$
Then, the row sum and column sum vectors of the TRM can be derived as $r$ and $c$, respectively, with $r_{i_d} = \sum_{j_d=1}^{T} t_{i_d j_d}$ and $c_{i_d} = \sum_{j_d=1}^{T} t_{j_d i_d}$. The causal diagram, or IRM, of all the aspects and topics can be derived by plotting the influence relationships, where $r_{i_d} + c_{i_d}$ and $r_{i_d} - c_{i_d}$ give the horizontal and vertical coordinates of topic $i_d$.
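The DEMATEL computation above can be sketched in a few lines of NumPy. The input matrix here is a made-up 3 x 3 example, and the function name `dematel` is hypothetical; the sketch also assumes that $I - N_R$ is invertible, which holds whenever the spectral radius of $N_R$ is below one.

```python
import numpy as np

def dematel(A):
    """Return the total relation matrix and the (r + c, r - c) coordinates."""
    A = np.asarray(A, dtype=float)
    # rho: the smaller reciprocal of the maximum row sum and maximum column sum
    rho = min(1.0 / A.sum(axis=1).max(), 1.0 / A.sum(axis=0).max())
    N = rho * A
    # Closed form of the series N + N^2 + N^3 + ...
    T = N @ np.linalg.inv(np.eye(len(A)) - N)
    r = T.sum(axis=1)  # total influence given by each topic
    c = T.sum(axis=0)  # total influence received by each topic
    return T, r + c, r - c

# Hypothetical initial direct relation matrix on a 0 to 5 scale.
A = np.array([[0.0, 4.0, 2.0],
              [1.0, 0.0, 3.0],
              [2.0, 1.0, 0.0]])
T_R, prominence, relation = dematel(A)
```

Topics with a positive `relation` value act mainly as causes and those with a negative value mainly as effects, while `prominence` indicates how strongly a topic is involved overall.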

The DANP
The DANP is an analytic method that integrates DEMATEL and the ANP, proposed by Prof. Gwo-Hshiung Tzeng [11,12]. Traditionally, the ANP requires a pre-defined structure of the decision-making problem. Thus, decision makers may introduce the structure based on the IRM derived by DEMATEL (refer to [41] for a typical example) or by other analytic methods. However, such work usually requires two or more iterations of collecting questionnaires, which is time-consuming and can be complicated. Respondents to the first-iteration questionnaire may refuse to provide opinions for the second-iteration questionnaire, which usually causes problems of inconsistency. Moreover, due to the complicated IRM derived by DEMATEL, a threshold value is usually required to screen the most important influence relationships inside the TRM. However, such screening usually filters out many connections in the TRM. To overcome such limitations, the DANP feeds the IRM derived by DEMATEL into the ANP. By leveraging the super-matrix proposed by Saaty in the ANP [21], the influence weights can be derived based on the following procedure.
Based on the TRM ($T_R$) derived in Section 3.3, the influence weights versus each topic can be derived by using the DANP method according to [42]. Let $T_C$ be the transposed matrix of the TRM, i.e., $T_C = T_R^{t}$. The matrix $T_C$ can be divided into $m_S \times m_S$ submatrices according to the topics belonging to the aspects. That is, $T_C = [T_C^{i_S j_S}]_{m_S \times m_S}$. The submatrices can be denoted as $T_C^{i_S j_S} = [t_{i_u j_v}]_{n_i \times n_j}$, where $1 \le i_u \le n_i$ and $1 \le j_v \le n_j$.
Here, $n_i$ and $n_j$ are the numbers of topics which belong to the $i_S$-th aspect, $D_{i_S}$, and the $j_S$-th aspect, $D_{j_S}$, respectively. Then, each column of $T_C^{i_S j_S}$ is further normalized by its column sum, $d_{j_v} = \sum_{i_u=1}^{n_i} t_{i_u j_v}$, $j_v = 1, \cdots, n_j$, so that each element of the normalized submatrix equals $t_{i_u j_v} / d_{j_v}$. Detailed explanations of the above process can further be found in [47]. The global priority vectors can be derived accordingly, along with the weights associated with each topic and aspect.
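The block-wise normalization described above can be sketched as follows. This covers only the transposition and the column normalization inside each submatrix; the later weighting of the super-matrix and its limiting process, detailed in [47], are not shown. The function name `normalize_blocks`, the example matrix, and the grouping of topics into aspects are hypothetical.

```python
import numpy as np

def normalize_blocks(T_R, groups):
    """Transpose the TRM and normalize every aspect-by-aspect block by column.

    groups: list of index lists, one list of topic indices per aspect.
    """
    T_C = np.asarray(T_R, dtype=float).T
    W = np.zeros_like(T_C)
    for rows in groups:
        for cols in groups:
            block = T_C[np.ix_(rows, cols)]
            d = block.sum(axis=0)  # column sums within this block
            W[np.ix_(rows, cols)] = block / np.where(d > 0, d, 1.0)
    return W

# Hypothetical TRM for three topics: topics 0 and 1 form one aspect, topic 2 another.
T_R = np.array([[0.2, 0.4, 0.1],
                [0.3, 0.1, 0.5],
                [0.6, 0.2, 0.3]])
W = normalize_blocks(T_R, groups=[[0, 1], [2]])
```

After normalization every column of every block sums to one, so each full column of `W` sums to the number of aspects. This is why a further aspect-level weighting is needed before the matrix can be raised to powers as a column-stochastic super-matrix.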

Empirical Study
This section presents a four-step procedure for social media mining, the derivation of criteria importance using the RF method, and the derivation of the influence relationships using the DEMATEL and the DANP. In this study, the psychological factors that can influence Taiwanese users' attitudes toward air pollution adaptation strategies were investigated. One of the major Taiwanese social media sites, Dcard (dcard.tw), was mined to retrieve related posts. The topic modeling algorithm was then used to retrieve important topics from the social media data. After that, the topics were clustered according to their probability. The clusters were reviewed, and users' attitudes were then assigned based on the meaningful names associated with the topics. Then, the feature importance of the topics was derived. Each topic served as the dependent variable in one analysis, while the rest of the topics served as the independent variables. The feature weights associated with the independent variables were derived. After being normalized and transformed into a five-point Likert scale, these feature weights served as the input for the DEMATEL as well as the DANP. The IRM and the influence weights were derived accordingly.

Scraping and Pre-Processing of Social Media Data
First, Dcard (dcard.tw), a popular website with 4 million users that accounts for around one sixth of the weekly social media posts in Taiwan, was used to mine users' opinions regarding the air pollution problem in the country. Air pollution is one of the most serious and concerning environmental issues in emerging economies in general, and in Taiwan in particular. A total of 3700 messages related to air pollution were retrieved using the Application Programming Interface (API) of Dcard in September 2020, although some of these messages dated back to 2016. The posts were collected from a number of boards, including Mood, Chats, Science, News, Beauty, Life, etc. Since the posts retrieved from Dcard contained much information unrelated to the analyses and many inconsistencies, they were pre-processed and cleaned. After unrelated posts were removed, 1043 messages were left for further analyses. Punctuation, common stop words, infrequent words, duplicates, errors, and messages unrelated to air pollution were removed from the full texts using a program the authors coded in Python 3.7 [9].
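The cleaning steps listed above might look roughly like the following sketch. It is not the authors' program: the function name `clean_posts`, the keyword filter, the `min_count` threshold, and the whitespace tokenization are all assumptions made for illustration (Chinese text would need a dedicated word segmenter).

```python
import re
from collections import Counter

def clean_posts(posts, stop_words, keywords, min_count=2):
    """Return one token list per retained post."""
    # Drop exact duplicates while keeping the original order.
    unique_posts = list(dict.fromkeys(posts))
    # Keep only posts that mention at least one topic keyword.
    related = [p for p in unique_posts if any(k in p.lower() for k in keywords)]
    # Replace punctuation with spaces, lowercase, and split on whitespace.
    tokenized = [re.sub(r"[^\w\s]", " ", p.lower()).split() for p in related]
    # Remove common stop words.
    tokenized = [[w for w in doc if w not in stop_words] for doc in tokenized]
    # Remove words that occur fewer than min_count times in the whole corpus.
    counts = Counter(w for doc in tokenized for w in doc)
    return [[w for w in doc if counts[w] >= min_count] for doc in tokenized]
```

The output token lists are the kind of input a topic model expects; the thresholds would have to be tuned to the actual corpus.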

Extracting the Main Topics Using the LDA Method
After the texts were cleaned, the LDA topic modeling method introduced in Section 3.1 was adopted to retrieve topics from the posts. The parameters were estimated after 1000 iterations of Gibbs sampling, using 12 topics for our data set. Based on the LDA, 12 topics with coherent groups of keywords (Table 1), which clearly described the associated meanings, were named by four environmental experts [9]. The 12 topics were fuel ($t_1$), masks ($t_2$), electronic cigarettes (e-cigarettes) ($t_3$), smoking ($t_4$), coal-fired power generation ($t_5$), refuse combustion ($t_6$), power generation ($t_7$), policy ambiguity ($t_8$), climate change ($t_9$), wind power generation policy ($t_{10}$), allergies and health ($t_{11}$), and air purifiers ($t_{12}$).
Based on LDA, the per-document topic assignments $z_{d\eta}$ and the topic proportions $\theta_d$ were obtained. Each message (document) was assumed to have a mix of latent topics, and each topic was assumed to have a certain probability of occurring in the document. A document-topic matrix represented the relationship between documents and topics. Each row in the matrix stood for a document and each column for a topic. Each entry was the probability of the topic in the document. The authors first normalized and standardized the document-topic matrix, and then used quartiles to group the probabilities. The lowest 25% of the document-topic matrix was defined as "1," the 25% to 50% portion as "2," the 50% to 75% portion as "3," and values higher than 75% as "4" (see Table 1). The five highest-probability terms in the top identified topics from the LDA topic modeling are summarized in Table 1. Then, the scales were normalized and transformed to a Likert-type 5-point scale for consistency with later methods.
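As an illustration of the quartile grouping described above, the following sketch maps the entries of a document-topic matrix to the levels 1 to 4. It assumes that the quartile cut points are computed over all entries of the matrix at once and that values equal to a cut point fall into the lower group; the function name `quartile_scale` is hypothetical.

```python
import numpy as np

def quartile_scale(doc_topic):
    """Map each document-topic probability to a quartile level from 1 to 4."""
    values = np.asarray(doc_topic, dtype=float)
    # Cut points at the 25th, 50th, and 75th percentiles of all entries.
    cuts = np.quantile(values, [0.25, 0.5, 0.75])
    # digitize returns 0..3 for the four intervals; shift to 1..4.
    return np.digitize(values, cuts, right=True) + 1
```

The subsequent transformation to a 5-point scale is not shown because its exact rule is not given in this section.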

Merging Similar Topics Using Hierarchical Cluster Analysis
After the topics were derived, they were classified further by using hierarchical cluster analysis. Based on the results of the cluster analysis, the topics were categorized into four clusters by using the SPSS statistical software (version 21.0), where the squared Euclidean distance was adopted to calculate dissimilarities between the clusters. (Refer to [43] for the detailed analytic process.) Then, according to the features of the topics, the four clusters were labeled as egoistic concerns (EC), altruistic concerns (AC), biosphere concerns (BC), and adaptation strategies (AS), the four aspects of the value-belief-norm theory proposed by Stern et al. [14] (refer to Table 2).
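A comparable clustering can be sketched with SciPy instead of SPSS. The sketch below treats each topic's column of scores as one observation and uses the squared Euclidean distance named in the text; the average (between-groups) linkage is an assumption, since the linkage method is not stated here, and `cluster_topics` is a hypothetical helper name.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

def cluster_topics(doc_topic, n_clusters=4):
    """Group topics into n_clusters by hierarchical clustering of their score profiles."""
    # Each topic is one observation: its column of scores across all documents.
    topic_profiles = np.asarray(doc_topic, dtype=float).T
    # Assumption: average linkage; the distance is squared Euclidean as in the text.
    tree = linkage(topic_profiles, method="average", metric="sqeuclidean")
    # Cut the tree so that at most n_clusters flat clusters remain.
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```

The returned labels only identify groups; naming the clusters (EC, AC, BC, AS) remains a manual step based on the topics' content.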

Derivation of Feature Importance by Using the RF Algorithm
Based on the results of topic modeling (see Table 1), for each topic, the feature importance of the other 11 topics was derived using the RandomForestRegressor in the scikit-learn Python toolkit [36]. For example, for the first topic ($t_1$), the feature importance of the other 11 topics was filled into the first column of the matrix $M_F$ (see Table 3) by using Equation (5) in Section 3.2. For the second topic ($t_2$), the feature importance of the other 11 topics was filled into the second column of the matrix. The same rule was applied to the rest of the topics. The largest element in each column was used to normalize the elements belonging to that column. Then, to be consistent with the definition of the IDRM of DEMATEL, the normalized result was multiplied by 5 to create the $\Omega$ matrix using Equation (6) (Table 4 below). By calculating the average of the scores of the topics associated with any one post belonging to some specific aspect, the feature importance matrix $M_F^{a}$ and the $\Omega^{a}$ of the aspects could be derived using the same approach by Equations (5) and (6). Since the aspect of biosphere concerns contained only two criteria, the RF and the DEMATEL were not applicable to most of the cases. Accordingly, the two topics belonging to the aspect of biosphere concerns were denoted as $BC_1$ and $BC_2$, respectively. These two matrices are demonstrated in Tables 5 and 6 below.
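The column-by-column procedure above can be sketched with scikit-learn's `RandomForestRegressor` and its `feature_importances_` attribute. Equations (5) and (6) are not reproduced in this section, so the sketch only follows the verbal description; the function names, the number of trees, and the random seed are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def feature_importance_matrix(scores, n_estimators=100, random_state=0):
    """Column j holds the importance of every other topic for predicting topic j."""
    scores = np.asarray(scores, dtype=float)
    n_topics = scores.shape[1]
    m_f = np.zeros((n_topics, n_topics))
    for target in range(n_topics):
        predictors = [k for k in range(n_topics) if k != target]
        model = RandomForestRegressor(n_estimators=n_estimators,
                                      random_state=random_state)
        model.fit(scores[:, predictors], scores[:, target])
        m_f[predictors, target] = model.feature_importances_
    return m_f

def to_initial_influence(m_f, scale=5.0):
    """Divide each column by its largest element and multiply by the scale."""
    col_max = m_f.max(axis=0)
    # Assumption: a column of zeros is left unchanged to avoid dividing by zero.
    safe_max = np.where(col_max > 0, col_max, 1.0)
    return scale * m_f / safe_max
```

The diagonal stays at zero because a topic is never used as a predictor of itself, and the most important predictor in each column receives the value 5.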

Deriving the Influence Relationships/Weights Using DEMATEL and DANP
Based on the $\Omega$ matrix derived by the RF, the influence of topic $i_d$ on topic $j_d$, denoted as $a_{i_d j_d}$ in the IDRM, is equal to $\omega_{i_d j_d}$ in the $i_d$th row and the $j_d$th column. Thus, $A = \Omega$. By adopting the process introduced in Section 3.3, the TRM can be derived as shown in Table 7. Then, the row sum and column sum vectors of the TRM can be derived as $r$ and $c$, respectively, in Table 8. The TRM of all the aspects as well as the $r_{i_d} + c_{i_d}$ and $r_{i_d} - c_{i_d}$ versus each aspect are demonstrated in Tables 9 and 10, respectively. The IRM is demonstrated in Figure 2. Further, the influence weights versus each topic and aspect can be derived according to the procedure outlined in Section 3.4. The results are demonstrated in Tables 8 and 10, respectively.
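For readers who want to reproduce the step from the IDRM to the TRM, a minimal sketch follows. Section 3.3 is not reproduced here, so the code assumes the commonly used DEMATEL formulation: the IDRM $A$ is divided by the largest row or column sum to obtain $N$, and the TRM is $T = N(I - N)^{-1}$. The function name `dematel` is hypothetical.

```python
import numpy as np

def dematel(a):
    """Return the total relation matrix T with its row sums r and column sums c."""
    a = np.asarray(a, dtype=float)
    # Assumed normalization: divide by the largest row sum or column sum.
    scale = max(a.sum(axis=1).max(), a.sum(axis=0).max())
    n = a / scale
    # T = N (I - N)^{-1}; this fails if (I - N) is singular.
    t = n @ np.linalg.inv(np.eye(a.shape[0]) - n)
    r = t.sum(axis=1)  # influence given by each element
    c = t.sum(axis=0)  # influence received by each element
    return t, r, c
```

The quantities `r + c` and `r - c` computed from the returned vectors correspond to the prominence and net-influence values used to draw the IRM.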

Discussion
In this work, a novel analytic framework, which consists of social media mining, RF, and MCDM techniques, was proposed. Further, the Taiwanese social media platform, Dcard, was used to retrieve data and validate the feasibility of the analytic framework. Meanwhile, influence relationships and influence weights were derived using the novel analytic framework. In the following section, the theoretical implications and advances in research methods presented in this study will be discussed.

Theoretical Implications
First, the mutual influence relationships among the three aspects from the VBN theory, i.e., altruistic, egoistic, and biosphere concerns, will be discussed. Based on the analytic results, the altruistic concerns influence both the egoistic and biosphere concerns. Furthermore, the biosphere concerns influence the egoistic concerns. The influence relationships are fully consistent with the original theoretical framework proposed by Stern et al. [14], which argues that the three environmental concerns (egoistic, altruistic, and biosphere) are mutually correlated. Environmental concern is the extent to which individuals are conscious of environmental issues and/or harms, support efforts to resolve those problems, and/or indicate an intention to contribute to the solution themselves [44]. According to Helm et al. [45], the three aspects are highly correlated. The less important influence relationships from egoistic concerns to biosphere concerns were not demonstrated in the IRM. This may be due to the lower value of total influence from egoistic concerns to the $BC_1$ aspect; thus, the influence was not demonstrated in Figure 2. A possible reason for this phenomenon is the separate analysis of the $BC_1$ and $BC_2$ aspects, which is necessitated by the infeasibility of deriving correct DEMATEL results from the feature importance derived by the RF algorithm when there is only one dependent variable and one predictor. The resulting unity feature importance yields an IDRM whose elements are all the same, for example, $[5]_{2 \times 2}$ in this case, from which correct results cannot be derived by DEMATEL.
The influence relationships from egoistic concerns to adaptation strategies are consistent with past works. The adaptation strategy is a response strategy to environmental problems in general, and the air pollution problem in particular [46]. Adaptation strategies can provide possible adaptation plans/actions to facilitate the adjustment of human society and ecological systems to address environmental disasters by increasing a system's ability or reducing its vulnerability [51]. Effective adaptation strategies are vital for the long-term success of an organization [46]. Egoistic concerns are expressed as functional benefits and emotional benefits [52]. A person with egoistic concerns seeks individual economic benefits and emotional benefits [52]. Individuals with higher egoistic concerns will particularly think about the expenses and advantages of an environmental behavior for themselves [53]. Because air pollution is a local environmental problem that directly influences personal welfare, people may adopt adaptation strategies for individual benefit. According to the earlier work by the authors [9], egoistic concerns have significant correlations with adaptation strategies toward air pollution problems. When egoistic concerns are higher, more people are directly concerned with specific local environmental issues that directly impact them, rather than being stressed by global problems such as climate change [54]. We believe that people may adopt adaptation strategies for air pollution if air pollution problems are anticipated to affect their own benefits. Based on the influence relationships derived, i.e., EC→AS, people will adopt adaptation strategies such as supporting wind power generation policies ($t_{10}$), taking medical treatment ($t_{11}$), and purchasing air purifier products ($t_{12}$).
The influence relationships from altruistic concerns to adaptation strategies are also consistent with past works. Altruistic concern is a willingness to take action even in the face of the free rider problem [14], which means that individual self-interest is not sufficient to produce a collective good [55]. According to Stern et al. [14], although some people will possibly anticipate sufficient individual advantages or benefits to rationalize provision of the collective good on egoistic grounds, most are also inspired by a more extensive, altruistic concern. Previous studies show that altruistic concerns may lead people to experience environmental stress and coping and then engage in pro-environmental activities [45]. Based on past works, altruistic concerns impact clients' purchase intentions regarding ecologically-friendly products [56]. According to the IRM in Figure 2, AC→AS, which means the influences from altruistic concerns are very important for the development of adaptation strategies. Among the topics belonging to altruistic concerns, coal-fired power generation ($t_5$) and refuse combustion ($t_6$) are the more important issues of concern to Taiwanese people. These air pollution-related problems influence consumer behavior toward purchasing air purifiers ($t_{12}$; 9.260%) and taking medical treatment ($t_{11}$; 8.855%). Though adopting wind power generation ($t_{10}$; 7.053%) is an alternative for reducing the threats caused by air pollution, the replacement of coal-fired or gas-fired power generation plants with green power needs long-term planning over many years. Therefore, wind power generation ($t_{10}$) is the least important strategy from Taiwanese social media users' perspective.
The influence relationship from biosphere concerns to adaptation strategies is also consistent with past works. Bio-spheric values reflect an individual's concerns/perception regarding the biosphere and highlight the quality of the natural environment, distinctly from its benefits to humans. Several studies have found that bio-spheric concerns are connected with pro-environmental behavior intention. According to Helm et al. [49], individuals with more bio-spheric concerns (for example, concern for living creatures and the environment) related to concerns about harmful impacts for all animals and plants on Earth might value the risks of climate change as more severe and stressful, and therefore will probably respond to them [57]. Thus, bio-spheric environmental concern is dominant in affecting psychological adaptation [45]. Nguyen et al. [58] pointed out that biosphere values stimulate active involvement in ecological consumption by enhancing clients' attitudes toward environmental protection and reducing problems related to environmentally-friendly products. Based on the work by Kiatkawsin et al. [59], bio-spheric values have more impact on customers' chances of purchasing sustainable merchandise. According to the IRM in Figure 2, $BC_1$ (policy ambiguity) has more influence on the adaptation strategies than $BC_2$ (climate change). This result is reasonable. First, based on the recognition of social media users, the influence of policy ambiguity ($BC_1$) is indeed stronger than that of climate change ($BC_2$). The terms associated with the only criterion ($t_8$) in $BC_1$ (green, nuclear, vote, government in Table 2) are those which have more influence on wind power generation policy ($t_{10}$). The stronger influence relationship can be observed from the TRM of topics in Table 7. The influence from $t_8$ to $t_{10}$ (0.195) is indeed much higher than the influence from $t_8$ to $t_{11}$ and $t_{12}$, which are 0.061 and 0.066, respectively. Further, the influence of climate change ($t_9$) on the three criteria in the AS aspect is 0.088, 0.041, and 0.039, respectively. This means that policy ambiguity ($BC_1$) is indeed the major topic influencing wind power generation policy within the AS aspect.
Finally, according to the result of the DANP in Table 10, the influence weights for environmental concerns and adaptation strategies are prioritized as EC $\succ$ AS $\succ$ AC $\succ$ $BC_1$ $\succ$ $BC_2$. Many environmental issues are considered social dilemmas; that is, when individuals pursue their own self-interest, this results in damaging consequences for the collective. For example, Knes [60] proposed that promoting pro-environmental behavior is recognized as a moral issue by altruistic individuals but not by egoistic ones in the context of climate change. However, our study proposes that egoistic concerns have a greater influence weight than altruistic and bio-spheric concerns in the context of air pollution. This may be because air pollution is one of the most pressing environmental and health issues, which can cause respiratory illnesses and allergies ranging from coughs to asthma, cancer, or emphysema. Related research by Vyver et al. [61] revealed that people who perceived higher health threats were also more likely to engage in a range of pro-environmental behaviors, such as turning off idling engines to reduce air pollution.

Advance in Research Method
The analytical framework, which integrates NLP, RF, and MCDM methods, is a novel one that bridges the gap between social media mining and MCDM research. Numerous scholars have developed works using these methods individually, and very few have tried to integrate NLP methods with SEM. To the best of the authors' knowledge, this work is the first to integrate these methods and derive meaningful results.
First, the RF algorithm can transform data retrieved from any database into the IDRM, which is required by DEMATEL. Traditionally, the MCDM method required opinions to be provided by experts. However, data retrieved from a database or the mass population (i.e., big data) can also provide very meaningful information. Thus, scholars have started to propose methods that integrate the RF algorithm and MCDM methods such as DANP (e.g., the works by Liu et al. [62] and Lo et al. [63]), which provide insights into management problems based on real data. In this paper, NLP-based social media mining techniques are further integrated to advance the existing RF and DANP-based method. Big data retrieved from social media can serve as the basis for uncovering social phenomena by using MCDM methods, which was previously difficult to achieve. Moreover, the influence relationships can provide more meaningful information than research based on traditional MCDM or statistical methods.
Second, the social media mining-based MCDM framework can provide more insights into social phenomena or social theories. Traditionally, scholars used statistical sampling-based methods such as covariance-based SEM or PLS-SEM to verify the theoretical framework. The social media mining-based MCDM framework provides new opportunities for verifying causal relationships and deriving new influence relations and the importance of aspects belonging to the theoretical frameworks.
In general, the proposed analytical framework advances both the MCDM-based analytical framework and the methods for verifying social theories. The analytical framework can be further adopted in big data analytics, uncovering real problems and confirming social theories by using big data.

Limitations and Future Research Possibilities
In terms of limitations, the analytic results are derived from a single Taiwanese social media site. The results may differ when mining social media sites from other regions or economies. Meanwhile, the empirical results are based on the VBN theoretic framework. Whether the analytic framework can derive satisfactory results that are fully consistent with other social theories is worth future study.
Further, as already mentioned in Section 5.1, when the number of criteria of some specific aspect is less than three, the RF-based DANP may not be feasible. The unity feature importance will cause an IDRM with the same elements, for example, $[5]_{2 \times 2}$. In this case, correct results cannot be derived by DEMATEL. This kind of situation will rarely occur in research that refers to prior academic works, e.g., confirmatory analyses based on SEM, which usually contain more than three to five criteria based on the questionnaires. Nevertheless, the phenomenon constrains the application of the framework to MCDM problems containing aspects with fewer than three criteria.
In the future, the novel analytic framework consisting of social media mining, RF, and MCDM methods can be used to retrieve more information from social media websites in general, and validate social theories regarding social phenomena in particular. The newly derived influence relationships between altruistic and egoistic concerns and altruistic and biosphere concerns are also worth further research and investigation.
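The degenerate two-criterion case can be checked numerically. The sketch below assumes the commonly used DEMATEL normalization (division by the largest row or column sum), which this section does not restate, and shows that the matrix $(I - N)$ that has to be inverted becomes singular for the $[5]_{2 \times 2}$ IDRM mentioned above.

```python
import numpy as np

# IDRM with identical elements, as produced by a unity feature importance.
omega = np.full((2, 2), 5.0)

# Assumed normalization: divide by the largest row sum or column sum.
scale = max(omega.sum(axis=1).max(), omega.sum(axis=0).max())
n = omega / scale

# DEMATEL needs the inverse of (I - N); a zero determinant means none exists.
det = np.linalg.det(np.eye(2) - n)
print(det)
```

Under this assumption the determinant vanishes, so no total relation matrix can be computed for such an aspect.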

Conclusions
During the past decade, social media has emerged as one of the major sources for mining opinions from users in major and emerging economies. Though numerous scholars and practitioners have dedicated attention to mining useful information from social media, a lot more can be retrieved from the available data. The MCDM theories and methods have been well developed and widely applied to numerous economic, management, and engineering problems. However, very few scholars have tried to integrate the MCDM method with social media mining techniques, even though interesting results, such as influence relationships and valuable insights, can be retrieved from social media data. Thus, the authors proposed an analytic framework that integrates the LDA, RF, DEMATEL, and DANP. In this study, Dcard users' attitudes and adaptation strategies regarding air pollution problems were retrieved and analyzed based on the value-belief-norm theory proposed by Stern et al. [14].
Based on the analytic results, the influence relationships are fully consistent with the value-belief-norm theory. That is, altruistic concerns influence both egoistic and biosphere concerns. Furthermore, biosphere concerns influence egoistic concerns. Moreover, all three aspects (altruistic, egoistic, and biosphere concerns) influence adaptation strategies. The mutual influences between altruistic concerns and egoistic concerns, as well as altruistic concerns and biosphere concerns, were seldom discussed in past works. Whether these two influence loops are self-enhancing or self-attenuating is worth investigating further.
According to the results derived by the DANP, the most important aspects of the analytic framework include egoistic concerns and altruistic concerns, which had influence weights of 31.613% and 24.394%, respectively. The results are fully consistent with the authors' earlier work using the PLS-SEM to analyze the VBN theoretic framework [9], in which these two aspects were the ones most closely correlated with the adaptation strategies. That is, the influence relationships are consistent with statistical results.
The analytic results presented here were derived based on the Taiwanese social media site Dcard. The results may be controversial when mining social media sites from other regions or economies. Meanwhile, the empirical results were based on the VBN theoretical framework. Whether this analytic framework can derive satisfactory results that can be fully consistent with other social theories is a question worth further study. In the future, this novel analytic framework can be used to retrieve more information from social media websites in general, and validate social theories regarding social phenomena in particular.

Institutional Review Board Statement: Not applicable. The study did not involve humans.

Informed Consent Statement: Not applicable.
Data Availability Statement: The data are not available because of ongoing studies.

Acknowledgments:
The authors thank Yu-Sheng Kao for the initial discussion of the research ideas regarding the analytic framework, and for his valuable opinions on revising part of the draft.

Conflicts of Interest:
The authors declare no conflict of interest.