Using an Evidence-Based Approach for Policy-Making Based on Big Data Analysis and Applying Detection Techniques on Twitter

Abstract: Evidence-based policy seeks to use evidence in public policy in a systematic way in a bid to improve decision-making quality. Evidence-based policy cannot work properly and achieve the expected results without accurate, appropriate, and sufficient evidence. Given the prevalence of social media and intense user engagement, the question to ask is whether the data on social media can be used as evidence in the policy-making process. The question gives rise to the debate on what characteristics of data should be considered as evidence. Despite the numerous research studies carried out on social media analysis or policy-making, this domain has not been dealt with through an “evidence detection” lens. Thus, this study addresses the gap in the literature on how to analyze the big text data produced by social media and how to use it for policy-making based on evidence detection. The present paper seeks to fill the gap by developing and offering a model that can help policy-makers to distinguish “evidence” from “non-evidence”. To do so, in the first phase of the study, the researchers elicited the characteristics of the “evidence” by conducting a thematic analysis of semi-structured interviews with experts and policy-makers. In the second phase, the developed model was tested against six months of data collected from Twitter accounts. The experimental results show that the evidence detection model performed best with the decision tree (DT) algorithm, which outperformed the other algorithms with an accuracy of 85.9%. This study shows how the model managed to fulfill the aim of the present study, which was detecting Twitter posts that can be used as evidence. This study contributes to the body of knowledge by exploring novel models of text processing and offering an efficient method for analyzing big text data. The practical implication of the study also lies in its efficiency and ease of use, which offers the required evidence for policy-makers.


Introduction
Social media play a special role in our daily lives. These platforms, which are based on the ideology and technology of Web 2.0 [1], have transformed the way people communicate and interact [2]. They work based on user participation in content generation and have led to the emergence of active and dynamic users instead of passive citizens, so that users are more engaged in diverse social developments [3]. In the meantime, such media are increasingly used to provide feedback on various social issues. In fact, with the emergence of participatory social media, a new ecosystem has come to light that facilitates citizen participation in social events [4]. Sharing content by citizens can give rise to social values, such as building a social discourse or raising awareness on political and economic issues [5].
The use of values created in this environment by the governments can lead to the formation of good governance in all countries, especially developing countries [6].
In the wake of the platform revolution, the role of policy-makers has changed. They have taken on new roles, including actively analyzing and extracting knowledge from the opinions of citizens using digital platforms [7]. One of the best ways to identify and understand citizens' views in a bid to take them into account in the policy-making process is using social media platforms [8,9]. Policy-makers need to be aware of citizens' opinions expressed in the form of social media posts, as they are published with minimal gatekeeping [10]. This creates an exceptional chance for policy-makers to interact more effectively with the citizens, learn about their needs and opinions, and take them into account in the policy-making process [11]. In fact, social media can serve as a channel between users and policy-makers and can be used as a new source for engaging citizens in formulating and implementing policies [12]. This also creates a great opportunity for governments to learn about their citizens' views and communicate effectively with them. Social media analysis has drawn more attention in recent years, with governments seeking to take advantage of this opportunity to boost user participation on social media [13].
Social media data analysis is not new to the literature, and researchers and experts have already analyzed various datasets in different countries through diverse methods and techniques over the past few years [14,15]. However, the application of these data and their analysis in the field of policy-making is a novel concept and there are many gaps and challenges that should be addressed [12–16]. Policy-makers have started using social media data in various sectors, including education [17], health [18], and communications [10]. Analyzing these data can contribute to improving the performance of governments, boosting the quality of services, creating new and developed forms of interaction with citizens, and promoting the welfare of citizens [3]. Doing so can not only help governments improve their decision-making and governance, but also revolutionize the creation and provision of such services [19,20].
Based on the evidence-based policy approach, the policy-maker is obliged to use various types of evidence. Traditionally, the basis of "evidence" is the knowledge elicited from applied research, statistics, surveys, focus groups, and so on. However, policy-makers can no longer ignore the evidence on various issues that is being widely produced by citizens on social media [21,22]. Since the question of how data retrieved from social media can be used in the policy-making process has not been answered yet [12], finding a response is of great significance. Even though various studies have been conducted on social media analysis and policy-making in recent years [20,22,23], social media have not been viewed as a source of much-needed evidence in the policy-making process. The significance of social media data used as evidence in the policy-making process is undisputed [24,25], but the unanswered problem is how to organize a large body of scattered data and use them as evidence in the policy-making process. Therefore, policy-makers need to distinguish "evidence" from "non-evidence" among a plethora of social media posts. There is an indication that Iranian policy-makers welcome the use of Twitter analytic tools, including monitoring dashboards, to obtain insights into diverse areas, including public health, education, technology, etc. To date, however, the Iranian technology policy-makers have not systematically used social media data in the policy-making process, despite the incontrovertible significance of such data, which can reflect the opinions and feedback of the users. This is set against the backdrop of the fact that the absence of the views of social media users, who are already sensitive to policy issues, can challenge the formulation, implementation, and evaluation of policies in the future. To this end, the present study sought to provide a model for detecting evidence in Twitter posts. After being designed and tested, the model was proposed for use during the process of technology policy-making in Iran.
In the first phase of this study, the authors drafted the characteristics of the tweets to be considered as evidence in the field of technology by reviewing the related literature and also by conducting semi-structured interviews with technology policy experts in Iran. By evidence, the authors mean tweets that are relevant, critical, and representative of the facts on the ground, which are capable of creating a criterion for the policy-makers to base their decisions on an evidence-based, rather than an intuition-based, approach. By doing so, the evidence-based approach can help reduce uncertainties and bridge the gap between speculations and facts. In the second phase, the researchers collected six months of data from Twitter accounts that pertained to the field of technology and labeled them as evidence and non-evidence. Next, they developed a model, based on these six months of Twitter data, to distinguish evidence from non-evidence.
In the following sections, first, the related background studies will be reviewed, then the steps of conducting the research study will be explained, and finally, its applications will be discussed.

Background
Technology policy was chosen as a controversial area, given the challenges facing its implementation and the public outcry it occasionally triggers. As a developing country, Iran has been facing complexities in terms of emerging technology policy. The multiplicity of formal and informal policy-making institutions, high conflict of interest among policy stakeholders, lack of transparency in policy formulation and implementation, insufficient knowledge of the technologies, the uncertainty of social, economic, and cultural effects of these technologies, and the complications of the economic environment are among such complexities [26,27]. Needless to say, the policies related to technology in Iran, as in many countries, are limited and too general, lagging behind the emerging technologies and preventing the policy-makers from responding appropriately to the rising problems. To reduce the potential inefficiencies, the policy-makers must be aware of the needs, interests, and desires of all stakeholders in the field of technology.
By January 2020, about 70 percent of Iranians had access to the Internet, a sharp increase of 11 percent from 2019. It is also estimated that more than 33 million Iranians (40%) use social networks. Many Iranians take to Twitter to express their critical and expert-level views on diverse social and political issues. By focusing on content rather than on the users' profiles, Twitter has created an environment where users can easily debate popular topics, with features such as hashtags and algorithmic timelines [28]. This thematic arrangement of content, as well as the openness and real-time nature of Twitter [29], has enabled experts and policy-makers in various fields to use it as a source of ideas to set their policy priorities [30].
Twitter is, ironically, expanding its user base and is estimated to have over 3.2 million users in the selected country of this study, with nearly 790 million tweets (including 200 million retweets) posted between 21 March 2021 and 20 March 2022 [31]. Twitter users in Iran were estimated to be around 2 million a year ago, with 500 million tweets (including 200 million retweets) published every year. It is evident that the users are mostly from the educated strata of Iranian society, that most of the content is created by the users themselves, and that, unlike other platforms (such as Instagram), there is less copying and pasting. Twitter has become highly politicized in Iran, and most policy-makers, public opinion leaders, and experts in various fields have official Twitter accounts. Via Twitter, Iranians from different walks of life, from ordinary citizens to the members of parliament and cabinet members, can express their views on diverse social, political, and economic issues in society. Iranians have found that launching a Twitter storm is an effective way to spread an idea, belief, demand, or protest. They already use hashtags and share them to make certain issues that have been underreported by the mainstream media trend on Twitter, as part of a strategy to grant themselves a voice [32]. It can be argued that Twitter data, as a representation of public opinions, can influence the policy-making agenda.
Nowadays, policy-makers actively take advantage of social media networks to reach a wider audience, raise awareness of the issues that matter to them, promote their views, mobilize supporters, and receive timely feedback [33]. Given the prevalence of Twitter among Iranian intellectual figures, policy-makers have already kept an eye on the content produced and spread on the platform. If detected and analyzed, such data can be translated into relevant evidence that is much needed by the policy-makers in emerging areas, including technology policy.

Evidence-Based Policy
Evidence-based policy is defined as the systematic use of evidence in formulating public policies. It is an approach that has its roots in evidence-based medicine and treatment [34,35]. Evidence-based policy became pervasive once the modernization of governments rose to the top of the agenda of countries and the tendency for policy-making based on social science analysis increased [35,36]. Policy-makers and government officials sought to respond to the needs and pressures of citizens for whom social services were inadequate or unsuitable. Frustrated with the low efficiency of policies used in the social programs, the policy-makers and government managers were prompted to seek new policy approaches [37]. Policies based on old ideologies no longer worked, and the policy-makers welcomed more modern approaches [27]. These new orientations were coupled with the tendencies of major NGOs, guilds, and other stakeholders in the public space to become involved in solving the existing problems [38]. The awakening of policy-makers opened up an opportunity for the public policy field to offer solutions in the form of new approaches to gain more control over the ambiguous and confusing realities of the policy-making environment. Policy-makers adopted an evidence-based policy approach as part of an effort to solve their problems and increase policy efficiency. This new approach is the result of the emergence of the inefficiencies of government policy-making and its implementation in various fields [8,39] and aims to develop policy alternatives to boost the quality of decisions made by policy-makers.
The evidence-based policy approach is based on two contexts without which proper enforcement of this policy-making approach is not possible. The first context is a favorable political culture that allows the inclusion of transparency and rationality in the policy-making process. The second includes a research culture committed to analytical studies using rigorous scientific methods that generate a wide range of evidence for policy-making [40]. Sufficient information is one of the prerequisites of a good policy [29–41]. Policy-making requires a variety of evidence in complex, variable, and high-risk policy areas. Therefore, generating up-to-date, multidimensional, and multilevel evidence can further help policy-makers in this field [42]. There are several reasons why the traditional evidence collected through applied research, which is available to the policy-makers, is not sufficient for effective policy-making. First, it is not possible to obtain accurate findings on key issues in many areas through research. Second, policy-makers and politicians are often influenced and motivated by many factors other than research evidence [41–43]. Therefore, it can be concluded that the availability of reliable research results alone does not guarantee its effectiveness. Valid, extensive, and multidimensional evidence is largely missing in evidence-based policy. Producing evidence with these characteristics demands consistent and long-term efforts by policy-makers and researchers.

What Is the Evidence?
An evidence-based policy cannot work properly and yield expected results without accurate, sufficient, and appropriate evidence. Searching for accurate and reliable evidence and efficiently using it in the policy-making process is viewed as one of the underlying principles of the evidence-based policy approach [39–44]. The following three types of evidence are important in policy-making: political evidence, analytical and technical evidence, and evidence collected from the field and through professional experience [24]. In fact, these types of evidence offer three kinds of lenses for policy-making. Each is produced with its respective knowledge and professional and political protocols, and is subject to policy-makers' interpretations, limitations, and constraints [45].
In evidence-based policy, the scope of evidence must be expanded to include all types of evidence [40–46]. Traditionally, the basis of "evidence" as the foundation of evidence-based policy is the knowledge produced via applied research on comprehensive trends and the explanation of social and organizational phenomena [46]. Based on the type of policy issue and the capabilities of policy-makers in data collection [34–47], governments need to have a certain level of evidence analysis capabilities for policy-making. In practice, however, such a capability does not always exist, and governments often fail to systematically analyze a large body of formal and informal evidence and incorporate it into the policy-making process [48,49].
From an evidence-based policy perspective, the question is as follows: what kind of data/information is needed to produce evidence? Some policy researchers and analysts have started asking the question of whether the persistence of complex social problems is due to the lack of data available to policy-makers [50,51]. However, the evidence now shows that obtaining more data to fill the gaps will not necessarily lead us to good policy solutions because the most important step to take in policy-making is to reconcile different values and political views with scientific evidence within a policy-making system that is capable of approximating the evidence elicited from the opinions of stakeholders.
Hence, there is disagreement about what kind and quality of evidence can help improve policies.In a broader sense, it can be argued that there is no single basis to define what evidence is [27].All types of different evidence must be used in a more inclusive context [37], that is, formal and informal evidence needs to be integrated to meet policy needs.

Challenges of Using Evidence in Policy-Making
Quantitative data and empirical methods have been used as a tool to provide accurate and reliable evidence for policy-makers for many years. Over the years, the focus has been on the advantages, such as the high accuracy of these methods in producing quantitative and analytical evidence [43–48]. However, there have always been three main challenges confronting the use of this evidence in policy-making. The first stems from the inherently political and value-oriented nature of policy and decision-making [45], which does not allow the use of this type of evidence in policy-making. The second is related to the fact that evidence is produced in different ways by different actors who look at the phenomena through different lenses. The third challenge is the complicated policy network that has made evidence difficult to use. Policy network actors interpret, understand, and prioritize evidence in different ways based on different experiences and values [45]. In the real world, policies do not follow analytical and empirical evidence but are rather based on judgments, values, and so on. Policy-making is a vague, politicized process that is sometimes contradictory to the original paradigms [52] and requires compromise, adjustment, and agreement that make it difficult to use evidence.
A crisis or emergency, political priorities, and changing social values in public opinion are among the cases where governments face difficulty in using evidence [27,49,53,54]. These are some of the complicated issues faced by policy-makers who will be entangled with problems in producing appropriate and multidimensional evidence, while pursuing the evidence-based policy approach. Instances of such policies can be found in areas related to moral and ethical issues. These are some of the major problems threatening evidence producers, which have validated the use of different types of evidence-based methods on different foundations in the policy-making process to meet the aforementioned challenges.

Evidence from Public Engagement
Government policies are implemented in the public sphere, or more specifically the sphere of public opinion, where all the policy stakeholders are present. This environment reacts to different policies, and the success or failure of policies is determined in this sphere. The policy-makers must respond to this complex environment in which stakeholders are present and encourage them to implement the policies. There is an emphasis on the need to collect and produce evidence based on the engagement of the public and all policy stakeholders, which is viewed as the basis for producing multidimensional and comprehensive evidence [54]. Engaging all stakeholders in a policy is often considered as a part of the policy-makers' long-term relationship with them, as well as the development of public engagement capabilities [55,56].
Involving citizens' digital participation in policies and employing it in the policy-making process in the form of evidence is already underway. Policy-makers need to use citizens' digital interaction data to learn about public views and the way public discourses are formed, who the main stakeholders are, and what their expert groups and communications are like, which are growing on a daily basis [55–57]. We are now witnessing an increase in the use of new technologies to collect and analyze citizens' online participation data [58,59]. These tools operate as a complement to the conventional data collection and analysis methods to generate evidence. This adds to the questions about how these digital audiences can provide useful evidence for policy-makers.

Social Media Data as Evidence in the Policy-Making Process
Social media data offer a good representation of the big data of any country, opening up many opportunities to raise awareness about the demands of citizens. Social media data enable accurate predictions, create knowledge, bring about new services, and can lead to providing citizens with better facilities [10,16,60]. Using social media data, policy-makers can reduce policy costs and achieve sustainable development in a variety of areas. In addition, social media data can enhance policy transparency, increase the credibility of governments and policy-makers, boost government oversight performance, and narrow the gap between government oversight and the realities in society [61]. Such characteristics of social media data can enable policy-makers to boost the oversight of citizens' digital interactions [12]. In addition, the ideal combination of social media data with policy-makers' background knowledge will result in more relevant information. Moreover, social media can generate large amounts of data, in contrast to the traditional sources of evidence, and can garner novel insights into how stakeholders think about policy issues. The value of social media data as part of the policy-making cycle and using evidence collected from social media is highly practical; therefore, it is not possible to ignore its significance [62]. Using social media data in policy-making is not a new issue, as various tools have already been developed for collecting, analyzing, and visualizing social media content almost on a daily basis [60–63].
Recent research has focused on combining social media data with the policy-making process and how these resources can enable policy-makers to obtain fresh insights. Fernandez et al. [64] employed the citizen participation (CP) framework and linked it to the digital participation created through social media. Panayiotopoulos, Bowen, and Brooker [13] used the crowd capabilities conceptual framework to underline the value of social media data for policy-makers in the policy-making process. They question how policy-makers understand the value of social media data and which of the collective capabilities of social media data can meet the needs of policy-makers for filtering data input in the policy-making process. Edom, Hwang, and Kim [5] examined the roles that communities formed on social media can assume in promoting the social responsibility of governments. They presented a typology that features government communication on social media in which social media data become a new resource for policy-makers and increase communication between policy-makers and citizens. Gintova [63] believes that few studies have shed light on the behaviors and views of users on state social media so far. In her study, she analyzed the experiences of social media users and the way they interact on the Twitter and Facebook pages run by a Canadian federal government agency. Driss, Mellouni, and Trabelsi [12] provide a conceptual framework for employing data generated by citizens on Facebook in policy-making. According to their research results, social media data can be used in two phases of the policy-making cycle, i.e., the definition of the problem and policy evaluation. Napoli [64] argues that social media, based on the framework of "public resources", are a public resource that can and should be used to promote public interests. He believes that there is a positive correlation between extracting and analyzing social media data and improving public life. The authors of [53] investigated the impact of social media on citizens' participation in events that needed policy intervention. They concluded that more access and activity on social media result in better participation in such affairs. Lee, Lee, and Choi [29] examined the impact of politicians' Twitter communications on advancing their policies, assuming that politicians around the world are increasingly using social media as a channel of direct communication with the public. The study results indicated that communication breakdowns on social media by politicians would have the greatest impact on policy implementation during the process of policy-making, causing the users to distrust the enforcement of the policies. Simonowski et al. [16] present a framework for analyzing social media data and how to use them in policy-making. They integrate data collected from users' digital participation on electronic platforms and social media, employing them at various stages of the policy-making process.
Previous research on social media analysis and policy-making already corroborates the importance of using social media data in policy-making. However, some studies have questioned the use of such data as evidence in policy-making [12,41,58,63–65].

Proposed Model for Evidence Detection
One of the major objectives that the present research seeks to realize is the identification of the characteristics that the Twitter posts need to possess to be considered as policy evidence. Since labeled data (evidence/non-evidence) were required to train the evidence detection model, the researchers carried out in-depth, semi-structured interviews with the Iranian technology policy practitioners to elicit the characteristics. The experts were selected based on their expertise and several years of practical experience related to technology policy-making. The interviews began with general questions about the effectiveness of Twitter in policy-making and proceeded based on the statements made by the interviewees. Prior to the interview, an interview guide, which contained a series of open-ended questions aimed at further preparing the interviewees, was emailed to them. The interviews were transcribed and analyzed via the thematic analysis technique to extract the characteristics of tweets that were deemed as tech-related policy evidence by the policy-makers, based on which the tweets were labeled as evidence or non-evidence in the next phase of the study. Some challenges arose while conducting the interviews with the experts and policy-makers about identifying the characteristics of the so-called evidence tweets. The definition of what was referred to as "evidence" was a subjective concept and lacked an objective criterion. In addition, some of the definitions had a broader scope and included professional experience, political knowledge, ideas of stakeholders, etc., while some others had a narrower definition of evidence based on statistical comparisons. In aggregate, 96 themes, 32 sub-themes, and 15 concepts (characteristics) were extracted from 480 comments and meaningful sentences. Table 1 lists the characteristics of the tweets considered as evidence by the interviewees.
To detect the evidence tweets, all tweets had to be labeled as evidence or non-evidence, based on the characteristics extracted from the earlier interviews with technology policy-making experts. The tweets were divided into two classes, evidence and non-evidence, using the above-mentioned characteristics during the labeling process.

Data Set and Feature Engineering
In order to develop an evidence detection model, as one of the other major objectives of this research, the following steps were carried out.

Data Collection
All of the data used in the present study were collected from Persian tweets posted over six months using the Twitter API tool, a data extraction tool employed by the developers.
First, 39 keywords related to "technology policy in Iran" were selected by reviewing the literature in Persian [59–66]. The tweets were searched and collected based on the selected keywords. As there were few tweets for some keywords, the authors decided to remove them from the list and place a greater focus on other more frequently used keywords. There were also some other relevant keywords commonly used by the users that did not exist in the technology policy-making jargon. The authors decided to include these keywords with the aim of collecting more tweets related to technology policy-making. They searched and collected tweets that contained the hashtags of these keywords. Based on the selected keywords, 28,277 tweets were initially collected. Nearly half of the tweets, which included duplicate tweets, retweets, spam, advertisements, or irrelevant tweets, were removed from the dataset during the pre-processing phase, leaving 14,029 tweets.

Table 1 (excerpt). Characteristics of tweets considered as evidence.
The topic has political priorities for the policy-maker
13. The urgency of the topic mentioned in a tweet
14. Reveal corruption in technology
15. Provide analytic and technical knowledge relevant to technology
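The removal of duplicates and retweets during pre-processing can be illustrated with a minimal sketch. The rules below are assumptions for illustration only: the helper names (normalize, is_retweet, preprocess), the "RT @user" retweet convention, and the sample tweets are hypothetical, and the spam, advertisement, and relevance filters used in the study are not reproduced here.

```python
import re

def normalize(text: str) -> str:
    """Drop URLs, collapse whitespace and lowercase so near-identical tweets compare equal."""
    text = re.sub(r"https?://\S+", "", text)
    return re.sub(r"\s+", " ", text).strip().lower()

def is_retweet(text: str) -> bool:
    """Assumed rule: a classic retweet starts with 'RT @user'."""
    return re.match(r"^RT @\w+", text) is not None

def preprocess(tweets: list[str]) -> list[str]:
    """Keep the first occurrence of each distinct, non-retweet tweet."""
    seen: set[str] = set()
    kept: list[str] = []
    for text in tweets:
        if is_retweet(text):
            continue
        key = normalize(text)
        if not key or key in seen:
            continue  # empty after cleaning, or a duplicate of an earlier tweet
        seen.add(key)
        kept.append(text)
    return kept

# Hypothetical sample: one original, one retweet, one near-duplicate, one distinct tweet.
sample = [
    "New #5G policy announced https://example.com/a",
    "RT @someone: New #5G policy announced",
    "new #5g policy   announced",
    "Startups need clearer #AI rules",
]
cleaned = preprocess(sample)
print(len(sample), "->", len(cleaned))
```

Normalizing before comparison is a design choice: it treats tweets that differ only in links, spacing, or letter case as duplicates, which may or may not match the criteria applied to the original dataset.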

Feature Extraction
After the pre-processing step, the appropriate features were extracted. In the previous research on social network user behaviors, account-based and text-based features were employed to analyze user data, implying the successful use of these features in analyzing Twitter posts. Many studies on social media analysis [56–67] have already employed account-based features to evaluate user profiles and text-based features to identify the behavioral patterns used in the text. To the best of the authors' knowledge, these features have not been used in any of the studies on policy-making to distinguish evidence from non-evidence. Therefore, given the successful use of these features in various studies, the authors decided to employ the selected features in order to distinguish evidence from non-evidence items. The research study also aimed to extract new features that can contribute to detecting evidence posts. Accordingly, both text-based and account-based features were used to distinguish evidence posts from non-evidence posts. Table 2 shows the text-based features used in the study.
Table 3 lists account-based features that showcase the characteristics of the accounts.
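To make the distinction between text-based and account-based features concrete, the following sketch maps a single tweet to a flat feature dictionary. The specific features, the Tweet fields, and the example values are hypothetical stand-ins chosen for illustration; they are not the features listed in Tables 2 and 3.

```python
import re
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    followers: int   # account-based field (hypothetical)
    following: int   # account-based field (hypothetical)
    verified: bool   # account-based field (hypothetical)

def extract_features(tweet: Tweet) -> dict[str, float]:
    """Map one tweet to illustrative text-based and account-based features."""
    words = tweet.text.split()
    return {
        # text-based features, computed from the tweet body
        "n_chars": float(len(tweet.text)),
        "n_words": float(len(words)),
        "n_hashtags": float(len(re.findall(r"#\w+", tweet.text))),
        "n_mentions": float(len(re.findall(r"@\w+", tweet.text))),
        "has_url": float(bool(re.search(r"https?://\S+", tweet.text))),
        "has_number": float(bool(re.search(r"\d", tweet.text))),
        # account-based features, computed from the author's profile
        "followers": float(tweet.followers),
        "follower_ratio": tweet.followers / max(tweet.following, 1),
        "verified": float(tweet.verified),
    }

# Hypothetical example tweet and account values.
example = Tweet("Broadband prices rose 30% this year #internet @regulator", 1200, 300, False)
features = extract_features(example)
for name, value in features.items():
    print(f"{name}: {value}")
```

Because Python's `\w` matches Unicode word characters, the same patterns also apply to hashtags and mentions written in Persian script.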

Feature Selection
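In this phase, to select the best subset of the features, the information gain metric was used to determine the value of each feature. Since the most effective feature for classification is the one that decreases entropy the most, the information gain metric is used for measuring the amount of entropy decline. In its standard form, the information gain of a feature f over the dataset D is calculated via the following formula:

\[ IG(D, f) = H(D) - \sum_{v \in \mathrm{values}(f)} \frac{|D_v|}{|D|} \, H(D_v), \qquad H(D) = -\sum_{c} p_c \log_2 p_c \]

where D_v is the subset of tweets for which feature f takes the value v, and p_c is the proportion of tweets in D that belong to class c (evidence or non-evidence). Table 4 shows the gain values of each feature. Some features extracted for the evidence detection model are more important for the model, while others are less important.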
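The information gain metric described above can be sketched in a few lines. This is a minimal illustration of the standard entropy-based definition for discrete features, not the authors' implementation; the function names and the toy feature columns (has_statistic, has_profile_picture) are hypothetical, and continuous features would first need to be discretized.

```python
from collections import Counter
from math import log2

def entropy(labels: list[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(values: list, labels: list[str]) -> float:
    """Entropy of the labels minus the weighted entropy after splitting on a discrete feature."""
    total = len(labels)
    groups: dict = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

# Toy data: four labeled tweets and two hypothetical binary features.
labels = ["evidence", "evidence", "non-evidence", "non-evidence"]
has_statistic = [1, 1, 0, 0]        # separates the two classes perfectly
has_profile_picture = [1, 1, 1, 1]  # identical for every sample

gain_informative = information_gain(has_statistic, labels)
gain_constant = information_gain(has_profile_picture, labels)
print(gain_informative, gain_constant)
```

A feature that takes the same value for every sample leaves the label entropy unchanged after the split, so its gain is zero, whereas a feature that separates the classes perfectly recovers the full label entropy.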
When the information gain of each feature was calculated, the value for some features was zero, which may be because those features had the same values for both evidence and non-evidence labels. For example, all the user accounts in the dataset had descriptions and profile pictures. It was also found that the time of posting (around the clock) was almost equally distributed between the evidence and non-evidence categories, so the posting time was not considered an important feature for training the algorithm.
Taking into account the information gain values of the features, different subsets were examined to implement the classification model. Accordingly, features 1 to 25 were chosen as the best subset of features in this study, as it trains the model to detect evidence tweets with a higher degree of accuracy. This subset included 20 text-based features and 5 account-based features. Figure 1 shows the information gain of each feature.
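The entropy-reduction idea behind this ranking can be sketched in a few lines of Python. This is a minimal illustration of the standard information gain definition for discrete feature values, not the authors' implementation; the toy feature names are hypothetical, and continuous features would first need to be binned.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


labels = ["E", "E", "N", "N"]        # evidence / non-evidence
has_url = [1, 1, 0, 0]               # hypothetical feature that separates the classes
has_picture = [1, 1, 1, 1]           # hypothetical feature that is constant for all accounts

print(information_gain(has_url, labels))
print(information_gain(has_picture, labels))
```

A feature that takes the same value for every sample, like the profile picture example, leaves the label entropy unchanged and therefore has zero gain.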

Proposed Classification Approach
Machine learning algorithms were used to distinguish evidence tweets from non-evidence tweets. Since the aim is to assign a specific output to each sample via evidence detection, our approach is based on supervised learning and classification techniques. The steps for implementing the proposed model are presented in the framework (Figure 2). The supervised learning approach involves two main processes: training the algorithm and testing the trained algorithm. The data used for this purpose have to be prepared based on the extracted feature set for each sample; for example, a sample S is represented as S = {f1, f2, f3, ..., fn} and labeled as L. Therefore, the dataset is represented as Data = {(S1, L1), (S2, L2), (S3, L3), ..., (Sn, Ln)}. Then, to train the machine learning (ML) algorithms, the dataset is split into two parts, a train set and a test set. To train the ML algorithm (classifier), all samples in the larger part of the data (the train set), along with their labels, are given to it in the format above, and the second part is then used for testing it.
The test set consists of samples that the ML algorithm has never encountered before. All samples in the test set, without their labels, are given to the trained classifier in the format Test = {S1, S2, ..., Sn}. The trained classifier then predicts and assigns a label to each sample. Finally, the predicted class of each sample is compared with its original label.
The decision tree is commonly used as an inductive inference algorithm to solve classification problems and develop prediction models, and it is one of the most widely used classification algorithms in supervised learning. Chapelle and Chang (2011) reported decision trees as the most popular class of function. Because of its simplicity of interpretation and its high classification power, the decision tree is also widely regarded as one of the most powerful algorithms for classification problems. The structure of a decision tree resembles a tree and consists of a root, nodes, and leaves. The best feature is located at the root, and each node splits the samples by comparing feature values. The leaves of the tree represent the final class results for the samples in question. In Section 5.3 of this study, we tested several algorithms, with the decision tree showing the best performance.
The performance of each classifier can be assessed by evaluation metrics.
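The train and test procedure described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions: the toy data, the hand-written rule standing in for a trained classifier, and the random 90/10 split are all hypothetical. The 10% test fraction is chosen only because it reproduces the 4104 and 456 sample counts reported in the evaluation; how the authors actually drew the split is not stated here.

```python
import random


def train_test_split(data, test_fraction=0.1, seed=0):
    """Shuffle labeled samples and split them into (train, test) lists."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def accuracy_on_test(classifier, test):
    """Give the classifier unlabeled samples, then compare with the held-out labels."""
    samples = [s for s, _ in test]  # Test = {S1, S2, ..., Sn}, labels withheld
    predicted = [classifier(s) for s in samples]
    correct = sum(p == label for p, (_, label) in zip(predicted, test))
    return correct / len(test)


# Toy data in the format Data = {(S1, L1), ...}: one hypothetical feature per sample.
data = [((i % 3,), "E" if i % 3 == 0 else "N") for i in range(4560)]
train, test = train_test_split(data)


# Stand-in for a trained classifier: a single hand-written rule on the first feature.
def rule(sample):
    return "E" if sample[0] == 0 else "N"


print(len(train), len(test), accuracy_on_test(rule, test))
```

In practice the rule would be replaced by a classifier fitted on the train set, such as a decision tree, while the withheld-label comparison stays the same.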

Evaluation Metrics
To examine the performance of the evidence detection model, we used metrics that are widely used by researchers in classification problems. After comparing the labels predicted by the classifiers with the actual labels of the samples, the results were grouped into the following four categories:

• True positive (TP): tweets that belong to class Evidence (E) and are correctly predicted as class E.
• False positive (FP): tweets that do not belong to class E and are incorrectly predicted as class E.
• True negative (TN): tweets that do not belong to class E and are correctly predicted as class non-E.
• False negative (FN): tweets that belong to class E and are incorrectly predicted as class non-E.
The above definitions are displayed via the confusion matrix in Table 5. The performance of a classifier can be evaluated by the accuracy, precision, recall, and F-measure metrics. Precision, recall, and F-measure are calculated from the confusion matrix values according to the following formulas:

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F\text{-measure} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$

Recall is the number of correct results divided by the number of results that should have been returned, while precision is the number of correct results divided by the number of all results returned. The F-measure is the harmonic mean of precision and recall. Accuracy is one of the most common metrics for evaluating classifier performance. It is calculated using the following formula:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
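The four confusion matrix counts and the metrics derived from them can be computed with a short Python sketch. The function names and the toy labels below are assumptions for illustration; class "E" is treated as the positive class, as in the definitions above.

```python
def confusion_counts(y_true, y_pred, positive="E"):
    """Return (TP, FP, TN, FN), treating `positive` as the evidence class."""
    tp = fp = tn = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == positive:
            if actual == positive:
                tp += 1
            else:
                fp += 1
        else:
            if actual == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def classification_metrics(tp, fp, tn, fn):
    """Precision, recall, F-measure, and accuracy from confusion matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy}


# Toy example: 4 evidence and 6 non-evidence tweets.
y_true = ["E", "E", "E", "E", "N", "N", "N", "N", "N", "N"]
y_pred = ["E", "E", "N", "N", "E", "N", "N", "N", "N", "N"]

counts = confusion_counts(y_true, y_pred)
print(counts, classification_metrics(*counts))
```

The guards against zero denominators are a convenience for degenerate cases, such as a classifier that never predicts class E.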

Experiments and Results
In this section, first, the significance of the topics from the users' viewpoints based on the number of tweets that contain the keywords is reviewed, and then the users' behaviors are analyzed, and finally, the proposed evidence detection model is assessed.

Statistical Report
Figure 3 shows the number of extracted tweets per keyword. Among the topics related to technology, "filtering" with 8817 tweets and "#filtering" with 1248 tweets were the most frequent keywords. This suggests that users considered them the most important topics. Tweets containing the keywords "Personal Information", "Information Disclosure", "Access to Information", and "Free Access" followed, in that order, as the next most important topics from the users' point of view.
Figure 4 illustrates the topics most frequently used by the users in a word cloud. The word cloud is arranged according to the number of tweets that contained the related keywords, indicating the significance of various topics in technology policy. These are the topics that attracted the most attention from users in the abovementioned period.

Analyzing the Behavior of the Users Posting Evidence Tweets
This section examines the behaviors of the users who posted the evidence tweets. To implement the proposed evidence detection model, all of the extracted tweets, regardless of keyword, were first merged into an integrated dataset, from which 4560 tweets were randomly selected. The evidence generated by Twitter users has its own characteristics, which are shown in Figure 5. According to Figure 5a, most of the evidence tweets did not contain a hashtag, and very few had a single hashtag. This indicates that texts that can be considered as evidence are often not hashtagged by the users. Figure 5b shows that most of the evidence tweets lacked emojis, meaning that users whose tweets can be considered as evidence mostly do not use emojis. The absence of emojis in most of the evidence tweets is indicative of the formal setting in which these tweets were posted.
Figure 5c also shows that most evidence-generating users are reluctant to use more than one mention in their tweets, which seems to indicate that they do not intend to post personal comments or reply to another specific tweet. Nor do they intend to address other users or attract their attention. Figure 5d indicates the absence of URL links in more than 87% of the evidence tweets, with the remaining 13% containing only one URL link. This indicates that referring to sources outside Twitter is not very common as far as generating evidence is concerned, as most of the evidence is generated and published within the platform. Figure 5e also shows that evidence-generating users often tend to express their opinions within 1-3 sentences, which shows the brevity of the evidence tweets.
In Figure 5f, it can be observed that a larger number of characters (mostly over 200 characters) were used in non-evidence tweets than in evidence tweets. Similarly, according to Figure 5g, the users are more inclined to use single-line texts, showing no tendency to post long tweets or threads. Figure 5h shows that the average number of words used in both groups is almost the same (around 4). According to Figure 5i, nearly 70% of the users used up to 40 spaces in the text of the evidence tweets to express their views, with the remaining 30% of those generating evidence tweets using more spaces than that. Moreover, the last three charts suggest less frequent use of punctuation marks, less frequent use of words exceeding five letters, and limited use of words exceeding three letters in evidence tweets.
The figures also show the behavior patterns extracted from account-related features (followers, followings, and lists). It was found that 75% of the users who posted evidence tweets had published fewer than 17,000 tweets, while 25% of those generating evidence had posted more than 17,000 tweets. By contrast, 35% of the users who posted non-evidence tweets had fewer than 17,000 tweets and 65% had more than 17,000 tweets.

Evaluating Proposed Model
To evaluate the proposed model, the authors used 4560 tweets, which were divided into two groups of evidence and non-evidence tweets. In this dataset, 2978 samples were labeled as evidence and 1583 samples as non-evidence. The dataset was then divided into two segments to train and evaluate the proposed model: the train set (including 4104 samples) and the test set (including 254 non-evidence and 202 evidence samples). To detect evidence tweets, six classifiers, namely support vector machine (SVM), decision tree (DT), K-nearest neighbor (KNN), linear discriminant analysis (LDA), XGBoost, and logistic regression (LR), were used [68] and their performances were assessed. First, the classifier models were trained on the train set, and after the learning process, the trained classifiers were applied to the test set. Table 6 shows the performance results of each classifier based on the evaluation metrics.
By applying the proposed model to distinguish evidence from non-evidence tweets, policy-makers can access useful tweets that can be considered as evidence and learn about the users' opinions, interests, needs, and capacities. The evidence and learning materials can be considered in making appropriate policy decisions worldwide, especially in the selected case country. Furthermore, the proposed model can be used with larger datasets to improve the training of the model and increase its accuracy.

Discussion
This paper contributes to the current body of knowledge by offering an evidence detection model to facilitate the use of data collected from Twitter, an influential social media platform used by policy-makers. This technology supports policy-makers by offering an evidence-based policy-making approach. Policy-makers can use the proposed evidence detection model to learn about the users' views, interests, needs, and capacities and make proper policy decisions accordingly. The model can be designed as a dashboard to be used by technology policy-makers.
The extension of policy evidence from formal evidence to informal evidence elicited from user-generated data on Twitter is the first theoretical contribution of this research. However, this should be treated with caution, as tweets cannot be deemed scientific evidence. Formalizing the use of social media data as evidence in the evidence-based policy approach is the second contribution of the study. To this end, features that are capable of turning user-generated social media data into evidence were detected. Some of the extracted features were general enough to be applied to other policy-making areas. The present study formalized user-generated social media data in the policy-making process by presenting a model based on several scientific fields, including policy-making (evidence-based policy-making), social media, and data science. The theoretical approach adopted in this article was initially derived from evidence-based policy. It also originated from citizens' digital participation in public policy.
To the best of the authors' knowledge, this study is the first to propose using social media data as evidence in policy-making and to design a model to detect such evidence. The proposed model can rid policy-makers of the daunting task of detecting evidence among a plethora of irrelevant data collected simply based on selected keywords. The model, if applied, can facilitate the use of such data in the evidence-based policy-making process. The authors decided to use posts published on Twitter, a social media platform in Iran with extensive user-generated content on public policies. In addition, the scope of the research was limited to technology policy, which, due to the emerging problems it deals with, needs to include citizens' views. Such policies can be applied to different communities. Moreover, this research study was conducted in a policy area in Iran where the policy-makers are reluctant to use users' opinions in the policy-making process. The model proposed in this study can facilitate this process and possibly encourage policy-makers to do so. Thus, implementing the model in similar developing countries can prepare the ground for the participation of citizens in shaping public policies.
This study addressed key challenges and the current gap in the literature on policy-making. Some of the issues raised by researchers regarding the use of social media data in policy-making can be summarized as follows:

• Lack of text-based analytical tools for detecting policy evidence from the posts on social media [12,69];
• Analytical tools developed so far are more suitable for improving the image of brands or obtaining feedback from customers about the products or services offered by companies and may fail to meet policy-making needs [13][63][64][65][66][67][68][69][70];
• Comments by non-expert users about specialized policy issues, in the form of social media posts, may not be included in the evidence category. Moreover, there is no specific tool to sort out such data [21,71,72];
• The big data shared on social media may have been tainted with bias, further complicating the analysis [70][71][72][73][74][75].
The research also offers a practical tool for policy-makers to use in various contexts. Because of its acceptable level of accuracy in detecting evidence among large datasets obtained from Twitter, the model verified in the present study can be easily employed by policy-makers. It may also be applied to other social media platforms in other developing countries. The authors suggest that researchers use the evidence detection model in other communities to detect evidence on diverse policy issues that citizens care enough about to produce content on social media, thereby providing policy-makers with evidence elicited from public opinion on social media. This model enables policy-makers to add a different, but very important, channel to their evidence-gathering channels.

Conclusions
The large amount of data shared on social media is considered one of the obstacles to using this information in the policy-making process. The main reason for this is the complex and challenging process of analyzing big data created by various users over an extensive time period. This paper facilitates the adoption of a novel approach to analyzing the data by reducing the size of the data and applying machine learning models. Detecting evidence automatically can facilitate the policy-making process by obviating the tremendous task of finding evidence among a large number of posts on social media. The present study contributes to the efforts to develop a model that can help policy-makers to distinguish evidence from non-evidence.
This study offers innovative contributions at several levels. First, using a machine learning approach, the investigation identified features of user-generated content that can be converted into policy evidence. Such features had not been identified in the previous relevant literature in the field. Despite being limited to a specific policy area (technology), many of the features are general enough to be used in detecting evidence posts in other areas as well. These features can also be developed further based on the selected social media platform and policy area. Second, this study integrated data analysis techniques and applied them to develop a model. Social media data analysis methods had not previously been used for detecting evidence. The proposed model is limited to the tested data and a specific policy area, which is technology in Iran. Given the significance of the first-hand evidence elicited from end users' feedback in technology policy-making, this research can serve as a model for the implementation of similar studies in other communities and policy areas. Moreover, the features on which the model was developed are of a global nature and can be used in similar cases across the world. The present research, for the first time, proposed a model that distinguishes evidence posts from non-evidence posts on Twitter. It can underpin the decisions made by policy-makers by providing social media users' views on different issues. Given the large amount of data on social media, including users' views and comments, it is not possible to use all the data in the policy-making process. The proposed model is capable of detecting evidence posts, supplying policy-makers with suitable data, and encouraging them to use the data in the policy-making process.
However, distinguishing evidence from non-evidence in social media data cannot be considered the only way to improve evidence-based policy. This is especially the case with complex problems, which can only be solved through different types of evidence. The present study simply aimed to develop a different type of evidence and did not intend to criticize other types of evidence. Sole reliance on insights from social media analytics can even result in polarized views and non-constructive arguments for policy-makers [72] and can mislead them, since social media data analysis methods are not yet developed enough to offer comprehensive views on policy-making. Nevertheless, the development of different concepts, methods, and techniques in this area can be helpful and may increase the efficiency of the policy-making process.
For future work, the authors suggest that other researchers test the proposed model on other social media platforms to determine whether the features capable of turning social media data into policy evidence on Twitter achieve similar results elsewhere. This study was limited to technology policy. The question that future research should address is whether diverse policy areas need different evidence detection models.

Figure 1. Values of information gain of each feature.


Figure 3. The number of tweets related to each keyword.

Figure 4. Word cloud of the topics most frequently used by the users.


Figure 5. Comparative charts related to the features.


Table 1. The evidence evaluation criteria in the field of technology policy from the perspective of policy-makers.

Table 2. Text-based features used in the study.

Table 3. Account-based features used in the study.

Table 4. Information gain of each feature.

Table 6. Performance evaluation comparison based on precision, recall, and F-measure (percen