2. Materials and Methods
The method used and the sample selection are described in detail in the following sections. The method used is social impact in social media (SISM) methodology [2
], which combines quantitative and qualitative content analysis of the sample selected considering the contributions of the social impact of the research [45
]. According to Elo et al. [46
], there are trustworthiness issues in the preparation phases of the data collection method, the sampling strategy and the selection of a suitable unit of analysis according to the way in which the research goals are defined. In this sense, the following sections develop in detail how the sample was selected and how the data collection and extraction were developed. The data analysis is explained in detail in the corresponding section, along with an explanation of the unit of the analysis used.
Regarding ethical considerations, the present research adheres to international ethical criteria related to social media data collection and corresponding analyses; in particular, we have followed the ethical guidelines for social media research supported by the Economic and Social Research Council (UK) [47
] and Ethics in Social and Humanities Sciences of the European Commission [48
]. Furthermore, we have perceived the risk of harm to and conserved the anonymity of users. Additionally, we have read the terms, conditions and legalities of each of the social media channels, and we have used only public information without identifying any user.
Likewise, the data were appropriately coded and anonymized to avoid the possibility of traceability. Sets of data have been secured, saved, and stored. The dataset analyzed and the calculations performed are available in the Supplementary Materials
(dataset). We cannot share all raw data due to the current terms of the social media channels and the General Data Protection Regulation (GDPR).
2.1. Sampling, Social Media Data Collection and Extraction
The first step to develop this study was the selection of a suitable sample of social media channels to collect the data. The social media channels for the analysis are Facebook, Twitter and Reddit, and their selection corresponds to three criteria: (1) relevance of the number of active users in millions according to Statista 2019 data: Facebook (2414), Twitter (330), Reddit (330); (2) availability of public messages; and (3) suitability for online discussion. There are other social platforms with millions of users, but these three have been selected because they are more suitable for our research study.
The chosen sample is exploratory and selective and is based on the following criteria.
Criteria 1: Selection of suitable searchable keywords. We selected the word “health” as a general topic and the specific keywords “vaccines”, “nutrition” and “Ebola”. The use of these specific keywords is based on the findings by Wang et al. [10
], in which the authors identify vaccines, nutrition, and Ebola as topics with more misinformation in social media. Specifically, we used the hashtags #vaccines, #nutrition and #Ebola to extract Twitter information. In relation to Facebook, we selected two public pages with more audiences in relation to the topic “health”. With regard to Reddit, we selected the topic “vaccines” in a community focused on this topic and an “AskScience Ama Series” focused on vaccines.
Criteria 2: Data extraction. The data extracted from Twitter contain tweets published under the hashtags selected in the last ten days. In the case of the Facebook page, the data are extracted from the last 100 posts published and the corresponding comments of the two Facebook pages selected. Finally, in Reddit, we selected the comments published in two conversations in two different communities (the AskScience Ama Series focused on vaccines and the vaccines subreddit). Table 1
shows the data collected.
2.2. Data Analysis
The strategy for data analysis aims to unveil the nature of interactions focused on misinformation or fake health information and the nature of interactions based on health evidence of potential or real social impacts. To do so, we have designed the following steps and strategy. The unit of the analysis is the full message published by the user (tweets, Facebook posts and Reddit messages (posts and comments)), which means that the information provided in the external links included in the messages is also analyzed.
Step 1: To identify which tweets (top 100 posts for each hashtag) and Facebook posts (top 20 posts) have received more attention. In the case of Twitter, this identification depends on likes and retweets. In the case of Facebook, this identification depends on likes on the posts of the Facebook pages selected and the public comments with more likes (top 20). In the case of Reddit, this identification depends on the total number of interactions of 4 conversations in the vaccines subreddit and the 100 most-valued comments by the community (the comments are sorted by reader preference), and each comment receives points given by different members of the community on the AskScience Ama Series. For the case of the AskScience Ama Series of one of the conversations, we selected the last 20 comments sorted by reader preference, such that each comment receives points according to the preferences of different members of the Reddit community. Table 2
shows the data selected.
Step 2: Development of qualitative content analysis for each message selected (N = 453) (tweets, Facebook posts, Facebook and Reddit comments). Researchers apply a classification of messages according to the codebook (see Table 3
) and interactions received. The social impact coverage ratio (SICOR) will be applied for each source of social media selected, which identifies “the percentage of tweets and Facebook posts providing information about potential or actual social impact in relation to the total amount of social media data found” [2
]. The elaborated codebook has four categories defined a priori as a result of the literature review performed. The categories classify messages analyzed regarding evidence of social impact, false news, misinformation, opinion, and facts. While the research team performed the first analysis with these four categories, two new categories emerged from the analysis. These two categories were messages that ask for evidence of social impact and messages that contain misinformation but search for dialogue to contrast with the information—both messages search for deliberation.
Step 3. In-depth analysis of interactions containing evidence of potential or real social impact.
2.2.2. Interrater Reliability (Kappa)
The analysis of social media data collected for the second analysis was conducted following a qualitative content analysis method, in which reliability was based on a peer-reviewed process. The sample was composed of 453 messages. Each message was analyzed to identify whether it contains evidence of potential or real social impact (ESISM = 1), if it was a message of misinformation or fake information in health (MISFA = 3), if it was an opinion (OPINION = 4) or if it was information (INFO = 2). The researchers involved were experts in the social impact of research and fake news. Each researcher was provided with the codebook before starting to code the messages. Once the analysis was finalized, the messages were coded and compared. We used interrater reliability in examining the agreement between the two raters on the assignment of the categories defined using Cohen’s kappa. The result obtained was 0.79; considering the interpretation of this number, our level of agreement was almost perfect and, thus, our analysis was reliable. In cases where no agreement was achieved, the raters decided to exclude the results (N = 450).
Before answering the research questions, there are initial steps to determine whether, among the messages selected in the sample, there are more messages and interactions based on misinformation or, on the contrary, there is more evidence of potential or real social impact. For this purpose, we first classified the messages (see Table 4
), and second, we calculated the SICOR.
In relation to the Twitter sample analyzed, we found that tweets with a higher percentage of ESISM were those that were published under hashtag #Ebola (39%), followed by #nutrition (18%). In the case of #vaccines, a lower percentage of ESISM was found (only 4%), but MISFA had a higher percentage (32%). The higher percentage of three hashtags selected is the node of INFO, and OPINION is higher in the #vaccines hashtag with 29%.
In relation to the data analyzed on Facebook, the lower percentage is under the code of MISFA (5% on Facebook page 1 and 3% on Facebook page 2), and the percentage of INFO is higher on Facebook page 1 than on Facebook page 2. However, OPINION is higher on Facebook page 2. The case of ESISM is only present in the case of Facebook page 1 with 15%.
In the case of subreddits, we selected examples focused on vaccines because this topic was the most controversial on Twitter. One of the results indicates that the percentage of ESISM (27% and 14%) was higher than the MISFA code (6% and 10%), INFO had the highest percentages in the subreddits (37% and 62%), and OPINION had the second-highest percentages (31% and 14%).
If we analyzed the total amount of data collected, we obtained the following SICOR for each social media channel selected, as shown in Table 5
. The SICOR calculation [2
] is a ratio that calculates the percentage of ESISM found in the full sample selected. In this case, the SICOR is the percentage of tweets with evidence of social impact in relation to all the tweets collected; the same applied to Facebook posts and comments and subreddit comments.
As we can see, the selected subreddit comments have a higher percentage of SICOR (23%), followed by tweets (20%) and Facebook posts and comments (8%).
In the case of MISFA, the percentage of total amount for each social media channel selected is shown in Table 6
According to the results, the MISFA percentage is higher in tweets (19%), followed by subreddit comments (7%) and Facebook posts and comments (4%).
3.1. Fake Health Information Social Media Messages are Mostly Aggressive
Regarding the research questions, the nature of the social media messages focused on false health information is that they are mostly aggressive in the sample analyzed. This result is mainly concerning the messages on vaccines in which the possibility of dialogue does not exist; there is no option, and the messages contained an affirmative closed sentence. First, we have detected hostility to arguments based on science and even defamation of scientists who have contributed to advances in the field of vaccines. One of the examples of fake news in vaccines said the following: “Vaccines have a long history of damaging the brain from day one”, followed by “The-so-called Father of Vaccination left his first son brain-damaged by vaccinating him, Jenner was smart enough not to vaccinate his second”. This is an example false of information that has negative impacts on the truth. First, Jenner devoted his life to overcoming smallpox. Jenner even freely treated poor people to save them from smallpox. His discovery had a substantial social impact, and his sons died due to tuberculosis, not due to smallpox vaccines. This scientific article explains in detail the contribution of Edward Jenner [49
] and concludes how his discovery and the promotion of vaccination facilitated the eradication of smallpox. Thus, spreading false information about this crucial discovery with defamation of real history harms citizens’ lives because this lie could damage their health.
Another example found that parents should be encouraged to boycott doctors who recommend vaccinating their children. There is an active anti-vaccine movement that is continually sharing this type of message in social media. The negative consequences are that some children who are not vaccinated have contracted diseases that could be avoided, in addition to adults who have done the same. For instance, in 2019, there were 1282 cases of measles in an outbreak in 31 states [50
]. This number was the highest number of cases reported in the U.S. since 1992. Most cases occurred among those who were not vaccinated against measles [50
]. The negative impact of this type of interaction affects the public health of cities and villages where people decide to follow these anti-scientific arguments.
For this reason, these types of messages are also aggressive, as they do not to follow scientific arguments related to health and cause physical damage and disease among children and those who share a common space. Regarding the examples of the type of MISFA, we found some messages that contain false information or misinformation but contain questions to open dialogue that may pose contrasts to one’s own assumptions with scientific evidence delivered by other persons or the beginning of deliberation. For instance, one of the examples begins with “I would be interested in a healthy and respectful conversation”, saying that he is a “vaccine agnostic”, sharing his opinion that he believes there are more risks than benefits. However, he is open to dialogue. This message is based on opinion, not scientific evidence. However, he is honest, saying that it is an opinion, and he is not assuming that he knows the truth; this is a first step towards dialogue. This case reflects another type of message found, people who are influenced by false information but open to having a conversation. Nevertheless, this opinion also has negative consequences, as he stated that his children are not vaccinated and, thus, they are also at risk.
3.2. Potential or Real Social Impact Social Media Messages Are Respectful and Transformative
Messages that contain the potential or real social impact of health are respectful and transformative. They deliver quantitative or qualitative evidence of the social impact that contributes to knowing health is being improved. Some of the illustrative examples of this are those published under the Ebola topic. One of the examples analyzed was the impact that the Ebola vaccine was finally approved. Ebola is a health concern, especially in DRC, due to the number of people affected and who die due to this disease. One of the examples shares quantitative evidence of the social impact of this vaccine and congratulates the people that made this result possible “Merck’s Ebola vaccine, which has been given to more than 258,000 people in the current outbreak in DRC”.
Another example of this type of message said that the Ebola vaccine is the best of 2019. This discovery offers hope and optimism for overcoming Ebola in DRC by offering qualitative evidence of the potential social impact of this vaccine. For example, a survivor of Ebola who took part in the vaccine trial was quoted as saying, “I can convince other people in my town that there is a treatment available for Ebola and that they can get better”, and a link was added to the WHO article with the full testimony of this survivor [51
]. Similarly, another example of qualitative evidence of social impact is delivered by a message that contains a documentary of Ebola through different testimonies, such as Jophet Kasere, who survived Ebola. However, his family did not; he works as a nurse, caring for children whose parents have been infected with the virus. The documentary recorded by Frontline shows how treatment delivered by WHO was improving the health of different members of the community, but at the same, shows how people who were against this international help tried to stop this improvement and leave people at risk [52
3.3. In Deliberation Contexts, Messages with Evidence of Social Impact Overcome Fake Information in Health
One of the results found is that deliberation contexts in social media promote the possibility of contrasting information and open dialogue based on valid claims. This example has specially been observed in the Reddit conversations analyzed. The social network allows conversations abiding by the rules of the communities. For instance, one of these rules states that conversations should be based on scientific information and not on false information. We have found examples of people with doubts or concerns regarding vaccines using Reddit to share their views and learn. In Reddit, we found MISFA D and ESISM D, because the common goal is to dialogue.
For instance, one of the examples found was a conversation initiated by a girl who was not vaccinated. She said “my parents never vaccinated me,” and she was concerned about this and her health. Her questions were addressed to the community, seeking help with regard to her situation. She received replies focused on helping her. For instance, she was told to visit her GP in order to receive an appropriate catch-up schedule, and the importance of talking to a doctor was stressed, “this is not something you should be deciding yourself or asking the Internet about; just ask your doctor”.
The second example selected was a conversation initiated by a person who holds anti-vaccine views but was searching for “some answers (if possible) to vaccine questions”. This person affirms that “there has never been a vaccinated Vs. completely unvaccinated study” to extract reliable conclusions about whether it is better to vaccinate or not. This person received replies with evidence of social impact focused on comparative studies between people who were vaccinated and not vaccinated, where those who were vaccinated exhibited better health than those who were not. Direct links to these studies were also provided. Some of the information detailed the following, “German study on lower rates of asthma among the vaccinated”, “comparing unvaccinated and vaccinated people who do catch the flu–vaccinated people are protected from the most serious effects, vaccinated versus unvaccinated children: how they fare in first five years of life, Nigerian study of 25 unvaccinated and 25 vaccinated children: one vaccinated child had a mild case of measles. Unvaccinated children: 3 dead, plus 11 non-fatal cases of measles”. The second reply selected detailed cases of measles in the U.S. and explained how the number of cases increased due to unvaccinated children, and this was then compared with Romania, providing scientific sources where the data are published. The result of this conversation is that the person who began this conversation read extended replies that were well-argued based on evidence of social impact and official data, replying enthusiastically, “Thanks! I’ll read these and think on the issue!” Thus a transformation was possible due to arguments based on evidence of social impact.
Another example selected is from the ASK ME SCIENCE conversations; this is a conversation where scientists are available for dialogue with citizens about different topics. In this case, vaccines were used. One of the conversations selected was concern around Andy Wakefield and his research—that is, an ex-physician who became an anti-vaccine activist, among those responsible for purporting a link between vaccines and autism. The clear and overwhelming consensus among scientists is that “His malevolent influence on the vaccine world was terrible, and we have still not fully recovered even though his publications and ethics have been debunked. Because of his paper, millions of people were not vaccinated, and thousands have died. What a legacy to live with”.
Furthermore, the final example selected was regarding the negative impact of a community that opts to remain vaccinated. This dialogue was started by someone sharing the concept of “herd immunity”. This concept details that in a community, there are people who cannot be vaccinated, such as due to allergies or those who are immunocompromised, and they depend on herd immunity to protect them. This was followed by highlighting that “If too many people who could get the vaccine but choose not to, it does not just affect the individual but can compromise others in the community as well”. This person explained that in his/her county, a large population chose not to vaccinate and, consequently, there was a measles outbreak; further, an emergency was declared. He/she has a friend who is immunocompromised and needed to stay home for fear of contracting the measles, “it was terrifying and preventable if people who could get the vaccine would choose to do so. It is a choice that does have an impact on others”. This conversation opens a dialogue on how our decisions based on false information can have a negative impact on the health of others. Thus, it is crucial to apply the evidence of social impact in collective matters to guarantee, in this case, successful public health.
The previous studies reviewed have been useful in clarifying how health information is spread in social media, identifying the positive and negative impacts [6
]. Regarding the adverse effects of using social media to spread misinformation, there is evidence of the harmful consequences to global health and well-being, becoming one of the most significant challenges for public health systems today [5
]. Some of the studies have advanced the identification of the types of profiles that spread vaccine-related disinformation [12
], and this helps to identify whether the profile that is posting could be a trusted source or not. Our study contributes to advances in the direction of overcoming false information in health through the analysis of how the messages and interactions are based on false health-related information and the transformative dimension of those messages based on evidence of social impact. This identification has made it possible to apply the SISM methodology, which is focused on evidence of social impact. The three social media channels show that there is a public online discussion regarding the object of study. The detailed analysis of the selected sample allowed us to identify deliberation contexts in the three social media channels; for instance, in Reddit, the open conversations encourage people to search for a dialogue based on valid claims.
Moreover, in this context, messages based on evidence of social impact overcome false information, even among those with previous anti-vaccine ideas but with an open-minded attitude and respect. However, it is not possible to engage in dialogue with those who have an aggressive position against science. This finding is especially crucial because it allows us to identify whether citizens have access to evidence on social impacts and whether they can share this evidence in conversations in which false information is spread. The evidence of social impact is the vaccine against false health-related information. Future research lines could replicate this analysis in other topics in which false information is damaging. Nevertheless, civil rights movements could also promote these findings to quickly overcome false health-related information that is causing deaths in adverse but avoidable situations.
On the basis of the research findings, there are several practical implications and recommendations for public health professionals. First, the results allow public health professionals to determine the type of health information with evidence of social impact that is most shared in social media. Second, the results also contribute to understanding the types of fake news with a stronger presence in social media that can reduce the effectiveness of public health social media campaigns. Third, this knowledge can be useful in the design of strategies in the public health sector to reverse fake news. Fourth, this knowledge can also be useful to narrow efforts to disseminate evidence of social impact in health to deactivate fake news. Finally, this study contributes to identifying discussion forums in which debates are occurring around health information to contribute to the dialogue providing health information with evidence of social impact.
This article demonstrates that SISM is a replicable methodology that has been successfully applied in social media analytics in relation to health and fake news, contributing to the further exploration of the possibilities of this methodology. This study offers the possibility to identify, on the one hand, evidence of social impact shared in social media and, on the other hand, misinformation or fake information related to health. Furthermore, the results show how the interactions in social media depend on the type of information shared or commented upon by diverse actors.
The analysis of Twitter, Facebook and Reddit unveils the different types of interactions regarding evidence or fake news, but they all have the common pattern of showing more messages of events or fact-related information about Ebola, nutrition and vaccines. Furthermore, in most cases, the existence of interactions regarding evidence is higher than that of interactions regarding the misinformation of fake information, although the percentage is much higher for the misinformation of fake information than for evidence in the case of Twitter #vaccines. With regard to opinions, the results indicate that they are much more frequent on Facebook and on subreddits than on Twitter. Moreover, the percentage of tweets and Facebook posts providing information about potential or actual social impacts in relation to the total amount of social media data (SICOR) is higher in Tweets and subreddit comments than in Facebook posts and comments. Another relevant finding is that messages focused on false information regarding health are mostly aggressive, and messages based on evidence of social impact are respectful and transformative. Finally, deliberation contexts in social media allow for the transformation of even those who have false information but who are open to dialogue when they participate and access evidence of social impact.
The findings provide insights into the way in which public health initiatives can support the presence and interactions of evidence as an effective strategy to combat fake news. Two main recommendations are suggested for public health professionals, among others. On the one hand, we narrow the dissemination strategies to reverse and deactivate fake news regarding health, considering that the percentage of misinformation on fake news is much higher than that observed for Twitter #vaccines. On the other hand, the design of concrete interventions for discussion forums in which health information is discussed (not only shared) can provide health information with evidence of social impact.
This research contributes to including citizens’ voices into research from a bottom–up approach, in line with the need to support science and social dialogue in relation to public health, including vulnerable groups [53
] or the role of patients to overcome barriers to health access [54
]. The possibilities of social media analysis have been widely explored in very diverse fields, from gender to digital protests [55
], and this work contributes to advancing knowledge in social media analysis and fake news in public health. Future investigations can use SISM to analyze the interactions in social media regarding other public health issues to further explore how citizens use and share information.