Climate Change Communication in an Online Q&A Community: A Case Study of Quora

An emerging research trend in climate change studies is to use user-generated data collected from social media to investigate public opinion and science communication on climate change issues. This study collected data from the social Q&A website Quora to explore the key factors influencing public preferences in climate change knowledge and opinions. Using a web crawler, topic modeling, and count data regression modeling, this study quantitatively analyzed the effects of an answer's textual and auxiliary features on the number of up-votes received by the answer. Compared with previous studies based on open-ended surveys of citizens, the topic modeling result indicates that Quora users are more likely to talk about energy, human and societal issues, and scientific research rather than the natural phenomena of climate change. The regression modeling results show that: (i) answers with more emphasis on specific subjects, but not popular knowledge, about climate change can get significantly more up-votes; (ii) answers with more terms of daily dialogue will get significantly fewer up-votes; and (iii) answers written by an author with more followers, with a longer text, with more images, or belonging to a question with more followers, can get significantly more up-votes.


Introduction
An emerging research agenda is to use social media data to analyze public opinion on climate change issues. Public perception of the existence of climate change and its impacts on the environment and society is an important issue with societal and political implications [1]. Public support is also crucial for legislation and for the implementation of climate change mitigation policies [2]. However, although there is a consensus in the scientific community that climate change is mainly caused by human activities and is already having significant negative impacts on the environment and society [3], many studies report a lack of agreement within the general public that anthropogenic climate change is occurring [4]. This discrepancy between the scientific community and the general public on climate change is essentially a science communication problem, which has stimulated many scholars to investigate the key factors influencing public attitudes or preferences on topics related to climate change [5].
Public opinion analytics is essential for a better understanding of the social environment and the dynamics of social changes. Among various sources of public opinion data, social media data is attracting great attention from researchers, as it provides highly valuable data about the public attitudes and opinions on controversial social events [6] and has been widely used to monitor and analyze public responses to natural or social phenomena [7]. For social science research, the nature and given name, in registration. Although this requirement is not mandatory, it has constructed a real-name environment in Quora, reinforcing the representativeness of Quora's data to reflect the public opinion. In addition, the ample auxiliary information, including author information, question information, and answer information, also augments the utility of Quora data for research in public opinion and science communication on climate change.
The aim of this study is to investigate the key factors influencing public preferences on climate change knowledge and opinions, with the user-generated-content data collected from Quora, particularly from the questions under the Climate Change topic in Quora. In this study, the measurement of public preference, which is always a thorny issue in traditional public opinion research [8], was naturally and quantitatively implemented by counting the up-vote number of an answer. Textual features extracted by topic modelling together with other features of each answer were integrated into a regression model to explain the influence of these features on the up-vote number of an answer. The results of the model reveal the mechanism of the science communication of climate change knowledge in social media sites, and the analytic framework in this study is expected to be widely applied as a methodological strategy in future social science studies, especially those involving online public opinion and science communication.

Data Collection
Quora is one of the most popular Q&A websites in the world. In March 2016, Quora revealed that it was seeing over 100 million monthly unique visitors to its Q&A social network, an increase of 22% from January 2016, when it reported having 80 million [28]. According to Alexa, in April 2016, Quora was ranked as the world's 128th most popular website, with most of the visitors from India (39.9%), the U.S. (23.1%), the U.K. (2.9%), Canada (1.8%), and China (1.8%) [29]. Quora does not collect or present user demographic information. However, based on the statistical data from Alexa, compared to the general internet population, Quora has more male users, more highly-educated users, and more young users [29].
Climate change is always a hot topic in Quora, with many questions and answers, providing the possibility of using quantitative approaches to analyze the public opinion and science communication of climate change. Data for this study were collected by a Python web crawler. The crawler accessed the website on 28 March 2016 and collected all accessible questions, answers, and auxiliary information under the Climate Change topic in Quora. The total number of questions under the topic was about 6800 at that time, and the latest 3400 ones, which were accessible to the public, were collected. A question can receive several answers, which are presented under the page of the question. Figure 1 presents a snapshot of a question's page in Quora. The page shows two types of information: question information and answer information. Question information includes the topic of the question, the text of the question, the number of followers of the question, view times of the question, creation date of the question, and the number of answers to the question. Answer information includes a brief introduction of the answer's author, the view times of the answer, the creation date of the answer, the up-vote number of the answer, and the text of the answer. The data of authors' activities and social statuses in the social network were also collected by accessing the profile page of the author, as shown in Figure 2. In total, 10,432 answers were collected, written by 3434 authors, to 2929 of the 3400 questions by that day. After removing common stopwords and four custom stopwords, including "climate", "change", "global", and "warming" in answer texts, 10,393 answers remained and were used in topic modeling and regression modeling.
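The stopword-removal step can be illustrated with a short Python sketch. It is only an illustration: the tiny `COMMON_STOPWORDS` set, the `tokenize` and `remove_stopwords` helpers, and the two sample answers are hypothetical, and the rule of dropping answers that have no tokens left is an assumption about why the number of answers fell from 10,432 to 10,393, which the text does not state.

```python
import re

# Tiny illustrative list; a real analysis would use a full stopword list.
COMMON_STOPWORDS = {"the", "is", "a", "an", "of", "and", "to", "in", "it"}
# The four custom stopwords named in the text.
CUSTOM_STOPWORDS = {"climate", "change", "global", "warming"}
STOPWORDS = COMMON_STOPWORDS | CUSTOM_STOPWORDS


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(text):
    """Return the tokens of the text that are not stopwords."""
    return [token for token in tokenize(text) if token not in STOPWORDS]


# Hypothetical answer texts, not taken from the dataset.
answers = [
    "Global warming is a change in the climate.",
    "Burning fossil fuels emits carbon dioxide.",
]

cleaned = [remove_stopwords(answer) for answer in answers]
# Assumption: answers left with no tokens are excluded from modeling.
kept = [tokens for tokens in cleaned if tokens]
print(len(answers), "answers collected,", len(kept), "kept")
```

In this toy case the first answer consists only of stopwords and is dropped, while the second is kept.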

Regression Model for Count Data
The answer's up-vote number was used to measure the public preference for the standpoints in the answer and was the dependent variable in the regression model. As Quora is an online social Q&A website with a real-name registration system, it is reasonable to assume that most Quora users vote for an answer prudently and that the up-vote number can effectively reflect the public preference for the answer. Because the up-vote number is a count variable (mean = 5.63, minimum = 0, maximum = 2600, standard deviation = 39.12), a Poisson regression model and a negative binomial regression model were used to analyze the data.
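The reported summary statistics already hint at how far the data depart from the equal mean and variance of a Poisson variable. The following sketch simply redoes that arithmetic from the two reported numbers; the variable names are illustrative, and the marginal variance-to-mean ratio is only a rough indicator, not a formal over-dispersion test.

```python
# Summary statistics of the up-vote number as reported in the text.
mean_upvotes = 5.63
std_upvotes = 39.12

# A Poisson variable has a variance equal to its mean,
# so a ratio far above 1 suggests over-dispersion.
variance_upvotes = std_upvotes ** 2
dispersion_ratio = variance_upvotes / mean_upvotes

print(f"variance = {variance_upvotes:.2f}")
print(f"variance / mean = {dispersion_ratio:.1f}")
```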
In this case, $Y_i$ is defined as the up-vote number that answer $i$ ($i = 1, 2, \ldots, N$) has received. The Poisson regression model assumes that the variable $Y_i$ is distributed as shown in Equation (1):
\[ P(Y_i = y_i) = \frac{e^{-\lambda_i} \lambda_i^{y_i}}{y_i!}, \quad y_i = 0, 1, 2, \ldots \tag{1} \]
where $\lambda_i$ is both the mean and the variance of $Y_i$ and is specified by a $k$-dimensional vector, $X_i = (x_{i1}, x_{i2}, \ldots, x_{ik})$, which includes all $k$ explanatory variables. The most commonly used formulation is to model the natural logarithm of $\lambda_i$ as a linear function of the explanatory variables, as shown in Equation (2):
\[ \ln(\lambda_i) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_k x_{ik} \tag{2} \]
The Poisson regression model constrains the variance to equal the mean. However, the dependent variable in this study is strongly skewed, with a variance far larger than its mean, which causes over-dispersion in modeling. This feature makes the Poisson regression model unsuitable for modeling an answer's up-vote number. To deal with the over-dispersion problem, this study employed the negative binomial regression model, which introduces a parameter to correct for over-dispersion when the variance is much larger than the mean. The formulation of the negative binomial distribution is shown in Equation (3):
\[ P(Y_i = y_i) = \frac{\Gamma(y_i + \varphi)}{\Gamma(\varphi)\, y_i!} \left( \frac{\varphi}{\varphi + \lambda_i} \right)^{\varphi} \left( \frac{\lambda_i}{\varphi + \lambda_i} \right)^{y_i} \tag{3} \]
where $\Gamma$ is the gamma function and the negative binomial distribution of $Y_i$ has a mean $\lambda_i$ and a variance as shown in Equation (4):
\[ \mathrm{Var}(Y_i) = \lambda_i + \frac{\lambda_i^2}{\varphi} \tag{4} \]
where $\varphi$ is called the over-dispersion parameter. When $\varphi \to +\infty$, the negative binomial distribution is the same as the Poisson distribution. As in the Poisson regression model, Equation (2) is used to link the explanatory variables to the negative binomial distribution of the dependent variable. A maximum likelihood approach was used for the estimation of both models.
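The relationship between the two distributions can be checked numerically. The sketch below is a minimal illustration, assuming the common mean and dispersion parameterization of the negative binomial distribution in which the variance is λ + λ²/φ (this is consistent with the statement that φ → +∞ recovers the Poisson distribution); the function names and the values of λ and φ tried in the loop are illustrative only.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of a count y with mean lam."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def negbin_pmf(y, lam, phi):
    """Negative binomial probability of a count y with mean lam
    and over-dispersion parameter phi (variance lam + lam**2 / phi)."""
    log_p = (
        math.lgamma(y + phi) - math.lgamma(phi) - math.lgamma(y + 1)
        + phi * math.log(phi / (phi + lam))
        + y * math.log(lam / (phi + lam))
    )
    return math.exp(log_p)


def negbin_variance(lam, phi):
    """Variance of the negative binomial distribution above."""
    return lam + lam ** 2 / phi


# As phi grows, the negative binomial probability of a given count
# approaches the Poisson probability with the same mean.
lam = 2.0
for phi in (0.5, 10.0, 1e6):
    print(phi, negbin_pmf(3, lam, phi), poisson_pmf(3, lam))
```

Working with log-probabilities through `math.lgamma` avoids overflow in the gamma function for large counts, such as the maximum of 2600 up-votes reported above.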

The Explanatory Variables
The aim of the regression analysis was to investigate the key factors influencing the up-vote number of an answer under the Climate Change topic in Quora. These key factors can be classified into two categories, namely textual features and auxiliary features.

Textual Features
Science communication of climate change is "a complex and contentious topic that encompasses a spectrum of issues from the factual dissemination of scientific research to new models of public engagement whereby lay persons are encouraged to participate in science debates and policy" [32]. The collected answers frame the climate change issue from different perspectives, which can have implications for multiple values or considerations and thus can attract public support in different ways.
Previous studies have also shown that individuals always selectively view and interpret information in ways which reinforce their already held beliefs [33,34]. Hence, different frames in an answer can influence the public preference on the answer, which is reflected by the answer's up-vote number.
From a formative perspective, frames are constructed based on a coherently semantic structure of particular shared meaning [35]. Hence, it is possible to identify major frames by analyzing discriminating terms and their clustering in the text. Previous studies have demonstrated that computer-assisted text analysis methods can efficiently detect frames in a large corpus [36,37]. In this study, structural topic modeling (STM) [38] was used to extract major frames (topics) in the whole corpus and of each answer text. In STM, a corpus can be summarized as several topics. A topic is a distribution on a vocabulary, and a text is a distribution on topics. For example, an answer text related to the natural phenomena of climate change has two topics, including "Climate Change" and "Natural Phenomena". The "Climate Change" topic has words related to climate change, such as "climate", "change", "global", "warming", and "earth", with high probabilities. Meanwhile, the "Natural Phenomena" topic has words related to extreme natural phenomena, such as "flood", "drought", "glacier", "sea-level", and "rise", with high probabilities.
Structural topic modeling is a highly automated approach. The only parameter that needs to be determined in the model is the number of topics. Hence, multiple models with different numbers of topics were built in order to select a preferable one. As there is no widely accepted indicator for choosing the most semantically meaningful model, the choice relied mainly on qualitative analysis: the authors inspected the most frequent terms and closely read the most representative texts of each topic to arrive at the preferred model. Finally, a ten-topic model was selected. Each topic in the ten-topic model was manually given a specific label to describe its practical significance. Appendix A presents sample results of alternative model specifications containing four, eight, and twelve topics.
Based on the results of structural topic modeling, a 10-dimensional vector representing the topic distribution for each answer text was obtained. However, this 10-dimensional vector cannot be directly used in regression modeling because of multicollinearity (for any answer text, the sum of the components in the 10-dimensional vector is always one). Hence, topic proportions were transformed into dummy variables. That is, a value of one was assigned to a dummy variable if the corresponding topic proportion was no less than 0.2. This threshold was chosen because it is double the average topic proportion of the 10 topics and it ensured that 99% of answers in the corpus could be explained by at least one topic [39]. The coefficient of each dummy variable indicated its effect on the up-vote number of an answer, compared with those answers that did not have such a prominent topic.
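The transformation from topic proportions to dummy variables can be sketched in a few lines of Python. The function name `topic_dummies` and the example topic distribution are hypothetical; only the 0.2 threshold and the "no less than" rule come from the text.

```python
THRESHOLD = 0.2  # twice the average proportion (1/10) in a ten-topic model


def topic_dummies(proportions, threshold=THRESHOLD):
    """Return 1 for each topic whose proportion is no less than the threshold."""
    return [1 if p >= threshold else 0 for p in proportions]


# Hypothetical topic distribution of one answer (ten proportions summing to one).
example = [0.45, 0.05, 0.02, 0.25, 0.03, 0.05, 0.05, 0.04, 0.03, 0.03]
print(topic_dummies(example))  # [1, 0, 0, 1, 0, 0, 0, 0, 0, 0]
```

Unlike the raw proportions, the resulting dummies do not sum to a constant, so all ten can enter the regression together; an answer may have several prominent topics or none.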

Auxiliary Features
In addition to the textual features, some auxiliary features related to answers were also included in the regression model. The first was the follower number of the answer's author. Answers written by Quora users are presented on their followers' homepages and are therefore easily read by their followers. Additionally, in a knowledge-sharing social networking website like Quora, followers can to some extent be regarded as fans of the followee, and are more likely to agree with and vote for the opinions of the followee [40]. Hence, more followers can bring more up-votes to the answers written by the followee. The second was the text length in terms of the number of characters. A longer answer may provide more details that enlighten readers and prompt them to vote for it [41]. The third was the number of images in an answer. Images can increase the likelihood of understanding a message by providing more vivid and comprehensive information [42]. It is a common strategy for authors to use images in their articles to amuse readers in social networking websites. Hence, it can be expected that answers with more images will get more up-votes. The last was the follower number of the question. More followers of a question mean that the question attracts more attention from Quora users. Thus, the answers under the question can also gain more exposure, and subsequently may get more up-votes. The number of days from the creation of the answer to the date of data collection was used as an offset to account for the time effect. Table 1 provides a summary of these auxiliary features. All these auxiliary features were scaled before modeling. In order to test for multicollinearity among the explanatory variables, the variance inflation factor (VIF) was computed using the method of Davis et al. [43], which is based on the correlation matrix from the information matrix of the variables.
In the Poisson regression model, the VIF values of explanatory variables were less than 1.81, and in the negative binomial regression model, the VIF values of explanatory variables were less than 1.39. These values show that little multicollinearity existed among all explanatory variables.
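The role of the offset can be written out explicitly. The following is a sketch under an assumption: the offset is taken to enter the log-linear predictor as the natural logarithm of the number of days $t_i$ between the creation of answer $i$ and the date of data collection, which is the usual convention for count models, although the text does not state the exact transformation. The symbol $t_i$ and the intercept $\beta_0$ are introduced here for illustration.

```latex
\ln(\lambda_i) = \ln(t_i) + \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij}
\quad \Longleftrightarrow \quad
\frac{\lambda_i}{t_i} = \exp\Big( \beta_0 + \sum_{j=1}^{k} \beta_j x_{ij} \Big)
```

Under this form, the coefficient of the offset is fixed at one, and the remaining coefficients describe effects on the expected number of up-votes per day of exposure rather than on the raw total.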

Major Topics in Answers
As indicated above, through manual inspection of the details of the established models, a structural topic model with 10 topics was found to be preferable in terms of both semantic coherence and exclusivity, compared to those with more or fewer topics. The selected 10-topic model is shown in Table 2, with the 15 most frequent terms, the proportion in the whole corpus, and the manually proposed label of each topic. Figure 3 presents the word clouds with the 25 most frequent terms for each topic to make the result easier to read and interpret.
Most topics in Table 2 refer to commonly discussed subjects related to climate change. For example, Topic 1 contains high-frequency terms such as "carbon", "fuel", "burn", "dioxid-", and "emiss-", and clearly pertains to the fuel and carbon issue, which is widely regarded as the major cause of anthropogenic global warming [44]. Thus, Topic 1 is labeled as "Fuel/Carbon". Topic 4 contains terms similar to those of Topic 1, but is labeled as "Energy", as the high-frequency terms in Topic 4 (i.e., "energi-", "power", "cost", "develop", and "renew-") reflect that this topic focuses on a more macro level than Topic 1 does. These two topics are relevant to energy and fuel, and account for 18.8% of the whole corpus.
Three topics, including Topic 3 (Human/Biodiversity), Topic 5 (Atmosphere/Weather), and Topic 7 (Hydrosphere), show more relevance to the influences of climate change on different aspects, including water, air, and species. They have a total proportion of 29.1%. Meanwhile, Topic 6 and Topic 9, together accounting for 21.2% of the corpus, focus on more societal issues relevant to climate change, including science communication and politics. Topic 10 discusses details of climate modeling, with many methodological terms, such as "model", "data", "predict", and "trend".
The remaining two topics, including Topic 2 (Livelihood) and Topic 8 (Future), contain high-frequency words that are commonly used in daily dialogues, rather than in specific subjects related to climate change. The two topics account for 22.7% of the whole corpus.

Regression Results
Table 3 presents the results of two count data regression models: the Poisson regression model and the negative binomial regression model. The following measures of fit were employed to quantify the model fit: Log likelihood, Akaike Information Criterion (AIC), and Bayesian Information Criterion (BIC). The results in Table 3 show that the negative binomial regression model fit the data better than the Poisson regression model, with a higher Log likelihood, a lower AIC value, and a lower BIC value. Hence, the following interpretations are mainly based on the results of the negative binomial regression model. These measures also indicate that the observed count data (i.e., the up-vote number of answers related to climate change in Quora) do have an over-dispersion problem. The estimated value of the over-dispersion parameter φ, described in Equation (4), is 0.5584. The better performance of negative binomial regression is consistent with prior studies arguing that negative binomial regression is more useful than the Poisson model in fitting over-dispersed datasets [26,45,46].
The effect of the explanatory variable on the dependent variable is determined by the regression coefficient β shown in Table 3. In both the Poisson regression model and the negative binomial regression model, a positive (negative) estimated value of the β coefficient for an explanatory variable indicates that an increase (decrease) in the variable leads to a higher expected count of up-votes, ceteris paribus.
As both count data models are linear in the natural logarithm of the expected up-vote number, the coefficients can be interpreted as follows: for a one-unit change in an independent variable, with the other variables held fixed, the natural logarithm of the expected up-vote number changes by the value of the estimated coefficient. As shown in the negative binomial regression results in Table 3, four auxiliary features (scaled Author followers, scaled Text length, scaled Image number, and scaled Question followers) were positively associated with the number of up-votes that an answer received, all at a significance level of p < 0.001. For example, the estimated coefficient for the scaled Text length was 0.195, which means that, with the other variables held fixed, a one-unit increase in the scaled text length multiplies the expected number of up-votes by 1.215 (exp(0.195) = 1.215).
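The conversion from a coefficient on the log scale to a multiplicative effect on the expected up-vote number can be reproduced directly. In this short sketch the helper name `rate_ratio` is illustrative; the coefficient 0.195 is the value reported for the scaled Text length.

```python
import math


def rate_ratio(coefficient, delta=1.0):
    """Multiplicative change in the expected count when a covariate
    changes by `delta` units and all other covariates are held fixed."""
    return math.exp(coefficient * delta)


text_length_coef = 0.195  # reported coefficient for scaled Text length
print(round(rate_ratio(text_length_coef), 3))  # 1.215
```

The same function applies to the topic dummy variables, where a change of one unit corresponds to an answer having the topic as a prominent one versus not having it.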

With regard to the textual features, based on the results of the negative binomial regression model, eight topics significantly influenced the public preference for an answer. Six topics, including Topic 1 (Fuel/Carbon), Topic 3 (Human/Biodiversity), Topic 4 (Energy), Topic 6 (Science Communication), Topic 9 (Politics), and Topic 10 (Climate Modeling), showed significantly positive effects on the number of up-votes an answer received. Two topics, including Topic 2 (Livelihood) and Topic 8 (Future/Impact), showed significantly negative effects in this regard. Meanwhile, Topic 5 (Atmosphere/Weather) and Topic 7 (Hydrosphere) had no significant effect on the number of up-votes obtained by an answer.

Discussion
In the near future, public participation in environmental issues will take place primarily via the Internet, and social media sites, which provide opportunities for interaction between policy makers and common people or between knowledge producers and knowledge receivers, will be the major platform for online public participation [47,48]. With regard to climate change issues, a huge volume of public opinion data is currently posted on social media sites. These data have been widely used to describe the profile of online public opinions about climate change [1,18,19,23,25]. However, more in-depth studies with predictive or prescriptive analysis are rare. This study responds to the lack of such empirical cases by highlighting the utility of combining the structured and unstructured data collected from Quora. The analytic framework in this study solves several conceptual and computational problems in leveraging the data, including using the number of up-votes to measure public preferences on certain standpoints, employing the Poisson regression model and the negative binomial regression model to fit the count data, and transforming unstructured text data into topical features which can be used in a regression model. The proposed framework is expected to be widely applied in future social science studies which intend to leverage big data from social media sites.
In addition to the methodology's significance, the results of topic modeling and regression modeling on the Quora data also have implications for better understanding science communication and public opinion on climate change. The topic modeling results summarize the online public opinion on climate change in Quora, which is one of the most popular Q&A websites in the English-speaking world. The 10 induced topics are distributed quite evenly across the whole corpus, with the most prevalent topic being Topic 2 (Livelihood), accounting for 12.4% of the whole corpus, and the least prevalent topic being Topic 4 (Fuel/Carbon), accounting for 7.8%. Most of these topics also appear in previous studies based on open-ended surveys of citizens in the U.S. and the U.K. aiming to find effective images associated with global warming or climate change [49], although with different proportions. For instance, natural phenomena related to climate change, such as ice melt, flooding, and abnormal weather, are prominent topics or effective images of citizens in the U.S. and the U.K. However, similar topics in Quora answers, including Topic 5 (Atmosphere/Weather) and Topic 7 (Hydrosphere), account for just 20.9% of the corpus. The proportion of topics focusing on energy and fuel and carbon emission issues (Topic 1 and Topic 4) is 18.8% in Quora answers, clearly larger than the proportions of the Greenhouse category in the U.S. and the U.K. (both less than 5%) [49]. In addition, human and societal topics, including Topic 2 (Livelihood), Topic 3 (Human/Biodiversity), Topic 8 (Future/Impact), and Topic 9 (Politics), account for about 40% of the corpus, compared with much smaller proportions of similar image categories in the U.S. and the U.K. The proportion of Science Communication (Topic 6) in Quora answers is 11.4%, much smaller than the U.S.'s naysayer category (23% in 2010) [50].
Meanwhile, scientific research on climate change, particularly the technical details of climate modeling, has a topic proportion of 8.0% in Quora answers, but seldom appears in citizens' images related to climate change. To sum up, on a knowledge-sharing and social networking platform such as Quora, users are more likely to talk about energy, human and societal issues, and scientific research rather than natural phenomena related to climate change, compared with the citizens' responses to open-ended surveys in previous research.
The regression modeling results quantitatively reveal the effects of different features on the public preferences for an answer. In terms of textual features, only four topics, including Topic 2 (Livelihood), Topic 5 (Atmosphere/Weather), Topic 7 (Hydrosphere), and Topic 8 (Future), had negative effects on the number of up-votes, and only Topic 2's and Topic 8's effects were significant. A possible reason to explain this is that Topic 2 and Topic 8 do not focus on specific subjects relevant to climate change, which can be inferred from their high-frequency terms shown in Table 2. Answers with a high proportion of those everyday terms cannot provide substantial knowledge to the readers. Hence, these answers can hardly get many up-votes and may even bore the readers. With regard to Topic 5 and Topic 7, although these two topics describe specific subjects related to climate change, the changes in atmosphere, weather, and hydrosphere are, to some extent, popular knowledge about climate change [2,49,50], which cannot stimulate the Quora users to vote for the corresponding answers. Nevertheless, the estimated coefficients of Topic 5 and Topic 7 were very small in absolute value and their effects were also insignificant, showing that the prevalence of these two topics cannot significantly influence the voting behavior of Quora users.
Topics with significantly positive effects on the number of up-votes all discuss specific subjects related to climate change. The largest effect came from Topic 6 (Science Communication), with a β value of 0.451. It is not surprising that the topic of science communication attracts more support from users in Quora, as Quora operates as a platform for online science communication. As reported by Alexa, users in Quora are more educated than the general internet population and may hold stronger beliefs in the scientific consensus on climate change issues. Hence, the discussion of science communication, especially the criticism of the deniers and the skeptics of climate change, may substantially resonate with those Quora users [33,34] and can get more up-votes. The second largest effect was from Topic 9 (Politics), with a β value of 0.348. Climate change issues are a significant item on political agendas at different levels [51]. From an international perspective, although agreement was reached on the Kyoto Protocol to the United Nations Framework Convention on Climate Change with the commitment of over 183 countries by 2009, these countries may be unwilling to act unilaterally, because "in doing so they would pay the full price of abatement but gain only a fraction of the benefit" [52]. From a domestic perspective, decisions on policies to mitigate climate change are closely tied to electoral interests, national discourses, and domestic political institutions [52]. The subtle linkage between climate change and politics may also be intriguing knowledge to Quora users. Other topics, including Topic 1 (Fuel/Carbon), Topic 3 (Human/Biodiversity), Topic 4 (Energy), and Topic 10 (Climate Modeling), are also specific subjects but not popular knowledge about climate change.
The effects of auxiliary features were all significantly positive. This is in line with our expectations indicated in Section 2.3.2. The most notable feature was Author followers, which had the largest effect on the number of up-votes, highlighting the importance of social capital for science communication in a social Q&A website such as Quora [26,53]. For a knowledge contributor (answer author) in Quora, the interaction between their social capital (represented by the number of followers) and their peer recognition (represented by the total number of up-votes they received) is complex. Based on the attention economy theory proposed by Simon [54], users' attention is a scarce resource in a social network. In order to get widespread attention from readers, knowledge contributors need both more followers and more up-votes, which are mutually reinforcing. In fact, as demonstrated in previous studies, contributors' expectation of getting more attention, including followers and positive feedback (up-votes), motivates the development of knowledge or information sharing websites such as YouTube [55] and Twitter [56]. Hence, in order to promote science communication in social media, an in-depth understanding of this complex interaction is necessary and needs further research.
People selectively read and understand information in ways that reinforce their already-constructed beliefs [3,34]. Previous studies with data collected from Twitter and Facebook show that the echo chamber effect is prominent in social media discussions, especially on topics related to climate change [57-59]. Facebook and Twitter can be regarded as pure social media sites and were originally designed for social purposes. Although there are a large number of posts about climate change on Facebook and Twitter, these posts are short, scattered, and full of personal emotions, and the echo chamber effect is significant in these posts [59]. However, Quora has unique features, including a topic-question-answer structure, a real-name environment, and social status stimulation (a good answer will attract more readers to follow the author; thus, the author will have a higher discourse power in the community). These features make Quora a more suitable platform for rationally discussing climate change issues rather than emotionally expressing personal attitudes. Thus, Quora has the ability to disrupt the echo chambers in the online environment.

Conclusions
This study demonstrated the utility of data collected from the online social Q&A community Quora for investigating science communication and public opinion, specifically regarding knowledge of climate change. By integrating web crawling, topic modeling, and count data regression modeling, a novel analytic framework was proposed to leverage the semi-structured dataset collected from Quora. The topic modeling result indicates that, compared with previous open-ended surveys of citizens in English-speaking countries (the U.S. and the U.K.) [49], Quora users are more likely to talk about energy, human and societal issues, and scientific research than about the natural phenomena of climate change. The regression modeling results revealed that: (i) answers with more emphasis on specific subjects, but not popular knowledge, about climate change can get significantly more up-votes; (ii) answers with more terms of daily dialogue will get significantly fewer up-votes; and (iii) answers written by an author with more followers, with a longer text, with more images, or belonging to a question with more followers, can get significantly more up-votes. These results are useful for promoting the science communication of climate change in online social Q&A communities, which implement a decentralized mode of knowledge production and are likely to become a major platform for public discussion of controversial environmental issues.
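The regression findings summarized above rest on a count data regression of up-votes on answer features. As a minimal sketch of that kind of model, the following Python example fits a Poisson regression, the simplest count model, with a single predictor by Newton-Raphson. It is an illustration only and not the model specification used in this study, which may differ (for example, a model that accounts for overdispersion). The function name `fit_poisson`, the variable names, and the eight data points are hypothetical and synthetic; they are not taken from the Quora dataset.

```python
import math


def fit_poisson(xs, ys, iterations=25):
    """Fit log(E[y]) = b0 + b1 * x by Newton-Raphson and return (b0, b1)."""
    # Start from the intercept-only solution so the first step is stable.
    b0, b1 = math.log(sum(ys) / len(ys)), 0.0
    for _ in range(iterations):
        mus = [math.exp(b0 + b1 * x) for x in xs]
        # Gradient of the Poisson log-likelihood.
        g0 = sum(y - mu for y, mu in zip(ys, mus))
        g1 = sum((y - mu) * x for x, y, mu in zip(xs, ys, mus))
        # Fisher information (negative Hessian), a 2 x 2 matrix.
        h00 = sum(mus)
        h01 = sum(mu * x for x, mu in zip(xs, mus))
        h11 = sum(mu * x * x for x, mu in zip(xs, mus))
        det = h00 * h11 - h01 * h01
        # Newton update: beta += inverse(information) * gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Synthetic illustration (NOT study data): a hypothetical log-scaled
# follower count per answer and the up-votes each answer received.
log_followers = [0, 1, 2, 3, 4, 5, 6, 7]
up_votes = [1, 2, 3, 4, 6, 9, 13, 20]

b0, b1 = fit_poisson(log_followers, up_votes)
# exp(b1) is the multiplicative change in expected up-votes per unit of x.
print(f"intercept={b0:.3f}, slope={b1:.3f}, rate ratio={math.exp(b1):.3f}")
```

A positive slope here corresponds to the kind of finding reported for the auxiliary features: the expected number of up-votes is multiplied by exp(b1) for each one-unit increase in the predictor.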
As a novel investigation with a new dataset and a new methodology, this study has some limitations. First, the lack of detailed demographic information about Quora users makes the representativeness of the sample unclear. We acknowledge that the sample of this study is biased. Even among Quora users, those who follow the questions and vote on the answers about climate change might be only those who are seriously concerned with the issue. Thus, the results of this study reflect only a fraction of public opinion. However, since Quora has been gaining more and more users, a full view of the Climate Change topic in Quora is still significant for research on public opinion and science communication of climate change. Second, question information is almost absent from the regression models (it is reflected only by Question followers), which may lead to a loss of important information. Third, some subjectivity exists in determining the number of topics in topic modeling and the threshold used to transform the textual features. Further research will therefore address each of these aspects. Demographic information, including gender and age, can be completed by image recognition of user icons [60]. Question information can be incorporated through a hierarchical regression model [61], which requires further classification of the questions. Subjectivity can be reduced by using more automated topic modeling approaches, such as the hierarchical Dirichlet process [62]. We expect that the proposed methodology, including the valuable Q&A data and the quantitative analytic process, will be widely used in future research on science communication and public opinion about climate change, as well as on more general social issues.

Appendix B
As Quora has rarely been investigated in academic research, we provide more details about this website as follows.
Quora is a Q&A site where questions are asked, answered, edited, and organized by its community of users. In addition to the Q&A function, Quora allows users to follow other users and to get an information feed from them, making Quora also operate as a social media platform. Its publisher, Quora Inc., is based in Mountain View, California. The company was co-founded by two former Facebook employees, Adam D'Angelo and Charlie Cheever, in 2009 and the website was launched on 21 June 2010.
In March 2010, Quora raised 11 million dollars in Series A funding, with Benchmark Capital as an investor. Series B funding of 50 million dollars was raised in May 2012, with Peter Thiel and Adam D'Angelo as investors. Series C funding of 80 million dollars, at a 900-million-dollar valuation, was raised in April 2014, with Tiger Global Management and Y Combinator as investors. The latest round, Series D funding of 85 million dollars at a 1.8-billion-dollar valuation, was raised in April 2017, with Collaborative Fund and Y Combinator as investors.
As a social Q&A website, Quora requires users to register with their real names rather than an Internet pseudonym (nickname). Although strict verification is not required, nicknames can be reported by other users in the community. In addition, users are encouraged to present personal photos and affiliations on their Quora profiles. These measures are intended to add credibility to answers. Some well-known people, such as Barack Obama, Hillary Clinton, and Jimmy Wales, also have accounts on Quora.

Appendix C
Example questions and answers about climate change in Quora.
Question 1: Would it be possible to produce a machine that could reduce CO2 into pure oxygen and carbon?
Answer: Absolutely! But it would take a great deal of energy, more than the energy you got by burning the fuel to begin with. So, it's not a practical solution. But there is a lot of research along similar lines, trying to turn CO2 into other useful products.
This answer was written on 24 December 2015 and got 2 up-votes by 4 May 2018.
Question 2: How long does carbon dioxide (CO2) stay in the atmosphere?
Answer: This is a tricky question because of how the carbon cycle works.
1. Any individual molecule of CO2 may cycle in or out of the atmosphere relatively frequently. Vegetation and phytoplankton take in huge amounts of CO2 every year, and release much of that CO2 back to the atmosphere the same year. That CO2 hasn't actually been "removed" from the atmosphere, it has only gone for a short trip and then returns.
2. The key issue from a climate change perspective is how long does it take for the large amount of added CO2 to leave the atmosphere for good. That actually takes place gradually over hundreds of years (before it's mostly gone). That's because the carbon has to find its way into permanent repositories of carbon that aren't part of the annual carbon cycle. This could be into the deep ocean, or it could be mineralized . . . there are a number of ways CO2 leaves the "daily and yearly" cycle. But it does take a long time. That's why people are concerned that climate change is effectively permanent, at least on the scale of a couple hundred years once we've made the leap.
This answer was written on 7 December 2015 and got 19 up-votes by 4 May 2018.
Question 3: What are the main results of COP 21?
Answer: Some interesting things happened on the way to the COP21 agreement:
1. All references to international aviation and maritime emissions disappeared from the final draft, apparently cutting these sectors loose. It's quite a switch from years of efforts to pull these rapidly growing emissions into the global framework of an agreement. And they were in there in most of the drafts circulated up to the very end.
2. All references to black carbon and short-lived forcers have apparently disappeared. I suspect people will argue they're included implicitly, but there has been such a big focus recently on trying to get countries to think about "quick action" measures on things like black carbon (which have a short-term greenhouse gas potential thousands of times greater than CO2) and methane that it's surprising there is no reference.