Protection from ‘Fake News’: The Need for Descriptive Factual Labeling for Online Content

Abstract: So-called ‘fake news’, deceptive online content that attempts to manipulate readers, is a growing problem. A tool of intelligence agencies, scammers and marketers alike, it has been blamed for election interference, public confusion and other issues in the United States and beyond. This problem is made particularly pronounced as younger generations choose social media sources over journalistic sources for their information. This paper considers the prospective solution of providing consumers with ‘nutrition facts’-style information for online content. To this end, it reviews prior work in product labeling and considers several possible approaches and the arguments for and against such labels. Based on this analysis, a case is made for the need for a nutrition facts-based labeling scheme for online content.


Introduction
Deceptive online content, commonly called 'fake news', has been blamed for election interference [1,2], confusing the public [3] and even causing an armed standoff [4]. Problematically, individual consumers seem under-prepared to deal with information that is not pre-vetted for them by a conventional news source, though concerns have also been expressed about conventional news sources [5,6]. Individuals have been shown to give undue trust to online content and communications [7]. Additionally, reader laziness [8] and repetition of exposure [9] have been shown to increase consumer belief in deceptive content.
With 55% of Americans indicating that they are, at least sometimes, getting news from unvetted social media sites [10] and 75% of Americans believing fake news headlines [11,12], a gap appears to exist between consumers' perception of media accuracy and the truth. Marchi [13] shows that this will only get worse, as teens prefer to get news from social media instead of journalistic sources and are unconvinced of journalists' objectivity.
Bakir and McStay [14] discuss several prospective solutions to the online content accuracy problems. These include increasing the prevalence of accurate news articles in feeds, verifying facts in articles manually, automated detection of deceptive content, and warning labels. A collection of journalists suggested changing algorithms to give individuals greater exposure to different perspectives, increasing source transparency in reporting, enhancing access to elected officials and collaborative journalism.
Bakir and McStay [14] note a clear profit motive for deceptive content sites: online advertising. Given this motive, there seems to be little hope of the issues related to deceptive content being resolved on their own or by those involved in its dissemination. Thus, as in other areas where market forces are not predisposed to solve problems on their own, there may be an important role for government to develop and enforce regulations. One agency, the Federal Trade Commission (FTC), has a key role in ensuring fairness in commerce. The FTC's role in regulating other products and its potential role in news media labeling are discussed. Other possible solutions for labeling enactment are also considered.
There are several factors that make nutrition facts effective. Nutrition information processing has been tied to a combination of knowledge and motivation [19]. To provide this motivation, nutrition facts labeling has also been supported by educational initiatives and outreach targeting both children and adults [20].

Other Governmental Labeling in the United States
In addition to the ubiquitous 'nutrition facts' that many people see multiple times per day on food packaging, the federal government regulates consumer disclosures on a wide variety of other products and for a myriad of services. A discussion of these regulations can be found in Appendix A.
Most relevant to online content labeling are the MPAA and V-Chip ratings for movies and television, respectively. Table 1 describes these two ratings systems. Notably, they focus on age appropriateness as opposed to more content-targeted restrictions (though, in some cases, a description of the reason for the rating is provided).

Table 1. MPAA and V-Chip ratings.
MPAA Rating | V-Chip Rating | Meaning
N/A | TV-Y | For the very young, targeted at ages 2-6
N/A | TV-Y7 | For children aged above 7
N/A | TV-Y7-FV | TV-Y7 with "fantasy violence that may be more intense or more combative"
G | TV-G | Suitable for "all ages" and "general audiences"
PG | TV-PG | Programs where "parental guidance" is recommended and which may not be appropriate for young children
PG-13 | TV-14 | Programs which may have content inappropriate for children under age 13 or 14
R | TV-MA | Programs for older audiences, 17 or above
NC-17 | N/A | Children under 17 are not allowed, even with parental supervision

The federal government has established a role for itself in ensuring that consumers have accurate information and are warned about dangerous products. However, in the area of content, the government has been far more restricted in its labeling and warnings (both those it requires and those it participated in developing), focusing on protecting children and ensuring that accurate and standardized details are provided for commercial transactions. This undoubtedly reflects the influence of the First Amendment to the United States Constitution which instructs that "congress shall make no law . . . abridging the freedom of speech, or of the press" [24].

Fake News
There is no universally accepted definition for 'fake news'. Instead, many individuals rely on their intuitive understanding of the meaning of the term. Researchers have also adopted their own definitions and associated concepts which conflict and overlap with others [25]. Lazer et al. [26], for example, defined fake news to be fabricated information that mimics news media content in form, but not in organizational process or intent. Shu et al. [27], alternately, argue that, "fake news is a news article that is intentionally and verifiably false". These definitions draw from different perspectives: article accuracy and author intent. Thus, an article may be considered "fake" because it presents misleading information or because of the deception-based response it intends to invoke in the reader.
These definitions can be contrasted with those provided by Tandoc, Lim and Ling [11]. They define legitimate news as "an account of a recent, interesting, and significant event", "an account of events that significantly affect people", "a dramatic account of something novel or deviant", and by stating that "a central element in the professional definition of journalism is adherence to particular standards, such as being objective and accurate".

The Fake News Problem
Fake news is fundamentally an issue of media credibility. Measures of media credibility have been studied for several decades in the field of journalism [28] and, more recently, in investigating the credibility of websites [29] and social media blogs [30]. Gaziano and McGrath [28] investigated the dimensions of credibility, the impact of the local nature of news coverage and the impact of the delivery medium on perceived credibility. Meyer [31] argued that certain aspects of credibility come into conflict, such as balancing believability with respect of the community. Corritore et al. [29] consider several factors of websites' credibility, including honesty, reputation, and predictability. Kang [30] observed that while anyone can have a blog, consumers may find blogs more credible than mainstream news sources due to their independent nature. Unfortunately, many of the factors (e.g., honesty, integrity) used to measure credibility are difficult to self-report in a meaningful way. It is unlikely that a newspaper would label itself dishonest, for example.
The problem, however, extends beyond just perception. Allcott and Gentzkow [1] demonstrated the impact of fake news in a review of the social, economic, and political impacts that fake news had on the U.S. Presidential Election in 2016. They discussed the overall level of fake news exposure among users of social media platforms and how it may have affected the election results.
Lazer et al. [26] identified two potential types of interventions that could help mitigate the exposure and influence of fake news: "(i) those aimed at empowering individuals to evaluate the fake news they encounter, and (ii) structural changes aimed at preventing exposure of individuals to fake news in the first instance". Online content labeling can be seen as using the second intervention proposed by Lazer et al. to facilitate the first. If fake news can be reliably identified and a mechanism for effectively warning users about it can be developed, these warnings can prevent users from inadvertently accessing fake news articles and warn users about articles that they do access.

Identifying and Classifying Fake News
To inform users about the dangers of fake news, it must be identified and classified. Identification is necessary to determine when to display a warning and classification is required to describe why the content is suspect. Fake news content can be manually identified; however, this is time-consuming and relies heavily on reviewers' subjective decision making. Statistical approaches for fake news detection have also been proposed. Wang [32] compiled a dataset of statements that were manually annotated over a period of ten years and applied a supervised machine learning approach which integrated metadata with text. This study demonstrated that improvements can be achieved for fine-grained automatic fake news detection using surface-level linguistic patterns [32].
Zhou and Zafarani's [33] survey of fake news detection methods identified prior work related to style, propagation, and user-based fake news analysis. Style-based analysis seeks to quantify machine learning features which can differentiate between a fake article and a real article. Propagation-based analysis attempts to detect fake news by quantifying the social network distributing it. User-based fake news analysis focuses upon the credibility of the individuals creating and distributing the news content.
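As a rough illustration of what a style-based feature might look like, the sketch below computes two surface-level features for a piece of text: the share of words found in a small emotion lexicon and the number of exclamation marks per sentence. The lexicon, the function name and the choice of features are assumptions made for illustration only; they are not taken from [32] or [33], and a practical detector would learn from many more features and a labeled dataset.

```python
import re

# Hypothetical, deliberately tiny lexicon of emotionally charged words.
EMOTION_WORDS = {"outrageous", "shocking", "disgusting", "terrifying", "amazing"}


def style_features(text: str) -> dict:
    """Compute two illustrative surface-level style features for a text."""
    words = re.findall(r"[a-z']+", text.lower())
    # Split on sentence-ending punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    emotional = sum(1 for w in words if w in EMOTION_WORDS)
    return {
        "emotion_share": emotional / len(words) if words else 0.0,
        "exclamations_per_sentence": (
            text.count("!") / len(sentences) if sentences else 0.0
        ),
    }
```

Features of this kind would normally be passed to a classifier rather than interpreted directly; on their own they say nothing about whether an article is accurate.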
Several schemes for classifying fake news have been proposed. Tandoc, Lim and Ling [11] conducted a review of articles that use the term "fake news" and developed a six-category classification system which included the categories "satire", "parody", "fabrication", "manipulation", "propaganda", and "advertising". Bakir and McStay [14], alternatively, proposed a seven-category system which included the categories "false connection (where headlines, visuals or captions do not support the content)", "false context (genuine content shared with false contextual information)", "manipulated content (genuine imagery/information manipulated to deceive)", "misleading content (misleading use of information to frame an issue or individual)", "imposter content (genuine sources are impersonated)", "fabricated content (100 per cent false, designed to deceive and harm)", and "satire/parody (with potential to fool but no intention to cause harm)".

Labeling Fake News Online
Once fake news content is detected and classified, this information must be effectively presented to users. In this section, prior work on labeling online content is discussed.
Fuhr et al. [34] proposed an "informational nutritional label" for online documents where labels would cover nine categories, which are presented in Table 2. In their proposal, each category would be accompanied by a column of "recommended daily allowances".
Unfortunately, not all of the categories proposed by Fuhr et al. [34] have a logical "recommended daily allowance". It is further unclear what a user would be expected to do if they do not meet a relevant standard (such as the reading level) or how other categories (such as virality) may inform reader decision making. Finally, the fact percentage rating is inherently problematic as a purely factual article may be biased if important related facts are omitted.

Table 2. Label categories proposed by Fuhr et al. (based on [34], quoted text from [34]).
Category | Description
"Fact" | Percentage of the document which is comprised of factual information
"Opinion" | Percentage of the document which is comprised of opinion statements
"Controversy" | Rating of the controversiality of the topics discussed in the article
"Emotion" | Quantity or percentage of emotionally charged words, sentences and terms in the article
"Topicality" | Time-dependent rating of how widely discussed the topic is, at present
"Reading Level" | Combination rating of writing quality (grammatical correctness) and an estimate of the reading level (in terms of years of education) required to understand the article
"Technicality" | Rating measuring how difficult it would be for someone to understand the content/vocabulary of the document from outside of its intended target field of study
"Authority" | Rating of the authority/trust level of the document's source
"Virality" | Time-dependent rating of the degree to which the article is in a "viral" distribution phase

Lespagnol et al. [35] proposed extending this model to include a rating for "information check-worthiness" which used a method to predict which claims in an article should be prioritized for fact checking. This label category paired "fact" and "opinion" to identify the percentage of statements which may or may not be facts. Vincentius et al. [36] also proposed an extension, observing that readers may desire additional information regarding how category scores are determined. For example, the user may prefer to judge the "authority" score of the article themselves, based on the article's source. To this end, they proposed three additional categories of information, which are presented in Table 3.

Table 3. Additional label categories proposed by Vincentius et al. (based on [36], quoted text from [36]).
Category | Description
"Source" | Publisher and author information
"Article Popularity" | The average number of tweets per hour (replacing the "virality" category)
"Political Bias" | Degree to which the article is written from a 'conservative' or 'liberal' political orientation

Problematically, classifying articles as having only one of two political orientations may be a significant oversimplification. Articles may also be flagged in multiple areas for a single issue, potentially resulting in an overly negative evaluation. If an article is factual, but those facts support a political position, it is unclear whether this should result in an article being flagged as biased.
The extensions proposed by Vincentius et al. [36] also raise the question of how bias would be detected. A solution to this is presented by Fairbanks et al. [37] who developed a technique for identifying "liberal words", "conservative words" and "fake news words". However, they were not able to identify reliable words for fakeness. The efficacy of this technique is unclear.
The impact of labeling on user behavior is also critical to consider. To this end, Mena [38] conducted a study that examined the impact of warning labels on Facebook users' intention to share articles. It used an experimental warning label design which identified two fake news articles as being "disputed by Snopes.com and PolitiFact". Users who received the flagged articles indicated a lower level of intention to share the misinformation than the control group. This impact was similar for participants across all three self-reported political affiliations (Republican, Democrat, and independent).

Social Media Operators' Content Moderation and Labeling
A June 2020 report by NYU Stern Center for Business and Human Rights [39] stated that Facebook outsources its content moderation to third-party vendors. Facebook employed approximately 15,000 moderators, compared to 10,000 employed by YouTube and Google and 1500 by Twitter. Facebook also partners with 60 journalistic organizations for fact checking of news posts. The report found that "given the daily volume of what is disseminated on these sites", the level of review is "grossly inadequate". In Indonesia, Facebook was used to spread "hoaxes" designed to sway local and national elections through 2018 and 2019 [39]. In Sri Lanka, racially charged hate-speech and calls for violence spread in 2018 and 2019, with direct ties to riots and attacks [39]. Many of these posts "remained visible on the platform through the time of the attacks and for days afterward" [39]. The report proposed eight recommendations for improvement, including to "significantly expand fact checking to debunk mis-and disinformation" [39].
Social media site operators have begun implementing systems for labeling some content (which meets a standard below that for moderation). Twitter has implemented a "manipulated media" tag (shown in Figure 2) that is applied to deep fakes and other intentionally deceptively edited media. They hoped it would limit misinformation during the 2020 U.S. Presidential Election [40]. The tag was first applied to a tweet by White House  [41].
Facebook introduced content labeling in July 2020 [42] for posts from Trump, Biden, and their respective campaigns; the labels directed users to factual information on how to vote. Facebook also responded [43] when Trump declared victory on election night by pushing a mobile notification stating that "votes are still being counted" that linked to official results. YouTube labeled numerous videos related to the 2020 U.S. presidential election with the phrase "The AP has called the Presidential Race for Joe Biden. See more on Google". In addition, the label provided a link to information from the Cybersecurity and Infrastructure Security Agency regarding election security measures in the United States [44]. These labels were applied not only to videos containing misinformation but also to videos and search terms that were merely related to the election. As shown in Figure 2, this label was applied when searching for terms such as "Trump", "Biden", or "election" on YouTube.
Throughout 2020, as shown in Figure 3, Twitter labeling has grown more specific, with tags related to "glorifying violence", providing "the facts about mail in ballots" and indicating the winner of the 2020 U.S. Presidential Election. These more descriptive labels also link to Twitter timelines referencing multiple news sources, as shown in Figure 4.
Figure 3. Twitter labels: (a) tag from [46], top left, (b) glorifying violence tag, top right [47], and (c) 'get the facts' tag, bottom [48].

Implementing Labeling for Online Content
Just as labels help consumers to choose foods, light bulbs and appliances and protect them from harmful products, like cigarettes, online content labeling can be similarly helpful. Deceptive online content has numerous negative impacts ranging from causing incorrect beliefs about carcinogens and vaccines [51] to promoting unhealthy food choices [52]. It creates "confused" feelings in consumers and can "undermine public risk judgements" [53]. For site operators, though, fake news can generate substantial profits, and advertisers support these sites due, at least partially, to the reduced cost for ad space, as compared to legitimate news outlets [54]. Site operators can use fake news to intentionally trigger emotional responses that increase viewing time, and thus operator profit [14]. Viewers, though, may suffer from impaired physical and cognitive development [55] as well as depression and anxiety [56]. For many of these reasons and others, fake news is seen as a threat to democracy and was identified by the World Economic Forum as "one of the top risks facing the world today" [53].
To combat these issues, Fuhr et al. [34] and others have promoted informational labeling of online content. This section discusses prospective methods for doing just this. First, label design considerations are discussed. Next, component data is reviewed. Then, potential labeling authority sources are considered.

Labeling Design Paradigms for Online Content
The first question is one of presentation: the disclosure could be a warning, such as the Surgeon General's warning used for cigarette packs (shown in Figure 5). Alternately, it could be informational, like the energy guide (shown in Figure A1 in Appendix A) or lighting facts (shown in Figure A2 in Appendix A). The disclosure could, instead, take an informational form unless a key indicator showed potential harm, in which case a warning would be displayed. The content accuracy disclosure, in whatever form, would join a variety of other descriptors that can be used to evaluate web pages. These include information about its security (SSL certificate, privacy policy evaluation), business trustworthiness (such as a Better Business Bureau rating) and user content ratings. It will be critical to ensure that users do not mistake a rating in one area for suitability in another. For example, a content trustworthiness rating does not indicate that a website is well secured for making online purchases.
In the following sections, three forms of labeling for online content are considered. Recommendation labels serve to inform the user as to whether, how, or in what context the page should be consumed. Informational labels make descriptive claims about the content of the media, allowing the user to interpret this data themselves. Hybrid labels provide recommended actions along with data to justify the recommendations.

Recommendation Labels
Recommendation labels, as shown in Figure 6, are like MPAA ratings in that they recommend a course of action and provide little to no context for why. An MPAA rating of "PG-13" recommends that no one under the age of 13 should view the film without a parent or guardian present. It does not list what in the film triggered the recommendation. V-Chip ratings may be used by a parent to control a child's access to television programs, when viewing unattended. Otherwise they similarly serve as a recommendation without much context. In labeling online content, a recommendation label may state that the content is "safe" or "unsafe", "real" or "fake" news, and recommend whether the user should continue to the site. A more restrictive recommendation may block access to certain online content rather than simply warning the user against proceeding. For example, a video containing graphic violence may be blocked entirely.
For news content, warning labels could include "opinion", "poorly cited", "deceptively edited" or otherwise communicate potential dangers of the content. Approval labels could state that the media is "well cited", "from a trusted source" or otherwise inform the user of positive qualities of the content. A warning label can provide additional supporting information through a separate link, if the user chooses to navigate there. The user is not prevented from viewing the content, but their path is momentarily interrupted with the warning.
Figure 6. Recommendation label warning that the content's source is unverified. It allows the user to continue to view the content or click a link to learn more information.

Informational Labels
Informational labels, like V-Chip content descriptors, provide information about contents without recommending user action. The V-Chip content descriptor "L", for example, indicates that the television program contains "coarse or crude language", but does not suggest whether or by whom it should be viewed. Instead, the user determines whether "coarse or crude language" is acceptable. In nutrition facts labeling, the lower section of the label lists the ingredients of an item, without goodness recommendations. Similarly, lighting facts labeling indicates the brightness, cost, and energy usage of a bulb but does not state what makes a 'good' bulb. Informational labels for online content may be presented, similar to the approach suggested by Fuhr et al. [34], without any contextualizing information provided alongside them. The informational label may state that an article "contains 50% fact, 20% opinion, 30% unverified claims, and is currently viral". It would then be left up to the user to determine how to act upon that information.
Rather than claiming a video is "deceptively edited", an informational label may state that a video "has been edited from the original". Instead of stating an article is "poorly cited", the number of citations could be provided without a statement as to how many are desirable. Fuhr et al. [34] proposed a list of label categories for this style of online content labeling including "emotion" and "technicality". An example label based on this is presented in Figure 7. This format leaves it up to the reader to determine the importance of the categories and their values. Some labels may not be obvious to interpret, though.
Figure 7. Example informational label based on the categories proposed by Fuhr et al. [34]. No information is provided beyond the category values, which must be interpreted. No recommendation is made.

Hybrid Informational and Recommendation Labels
Hybrid labels combine information and context to make recommendations. Nutrition facts labeling provides details such as the amount of fat and sugar in a food product, but they also include recommended daily allotments for each category to guide decision making. Cigarette warning labels provide a stark warning to not consume cigarettes. They also generally include a contextual statement such as "cigarettes cause cancer" to explain why this recommendation is made. Both are examples of hybrid labels, with nutrition facts prioritizing the delivery of the information and cigarette packaging prioritizing warning visibility. A hybrid label for online content could include similar labeling categories as described by Fuhr et al. [34] but provide information paralleling 'recommended daily allowances'. Contextual information could include statements such as "viral news sources may be artificially propagated", "real news sources tend to contain less than 20% opinion statements" and "articles with more than 5% emotionally charged words tend to be biased". This context combined with the category values would allow the user to make a more informed decision.
For news content, a hybrid label could include the number of citations and the average number of citations for an article of similar age and on a similar topic. In addition to stating that a video has been "edited from the original source", the label could state that "this video may not accurately represent the facts". An expanded version could include multiple categories of pertinent information along with context for each. The recommendation may be implicit through this contextual information. For example, in Figure 8, the fact that the article has an 'emotion' score of 12% is given context by stating that 'often credible sources have less than 7%'.
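As a rough illustration of how a hybrid label might pair category values with context, the following Python sketch attaches a contextual statement to each measured value and leaves the recommendation implicit. The reference values (20% opinion and 7% emotion) are simply the illustrative figures quoted in the text; the names `HybridLabelRow`, `build_hybrid_label` and `render` are hypothetical and do not describe an existing tool.

```python
from dataclasses import dataclass

# Illustrative reference values only: 20% opinion comes from the example
# statement in the text and 7% emotion from the Figure 8 example.
REFERENCE_LIMITS = {
    "opinion": (20.0, "real news sources tend to contain less than 20% opinion statements"),
    "emotion": (7.0, "often credible sources have less than 7%"),
}


@dataclass(frozen=True)
class HybridLabelRow:
    category: str
    value_pct: float
    context: str             # empty string when no reference value is known
    exceeds_reference: bool  # True when the value is at or above the reference


def build_hybrid_label(values_pct, limits=REFERENCE_LIMITS):
    """Pair each measured category value with its contextual statement."""
    rows = []
    for category, value in values_pct.items():
        limit, context = limits.get(category, (None, ""))
        exceeds = limit is not None and value >= limit
        rows.append(HybridLabelRow(category, value, context, exceeds))
    return rows


def render(rows):
    """Render rows as plain text; the recommendation stays implicit in the context."""
    lines = []
    for row in rows:
        line = f"{row.category}: {row.value_pct:.0f}%"
        if row.context:
            line += f" ({row.context})"
        lines.append(line)
    return "\n".join(lines)


label = build_hybrid_label({"emotion": 12.0, "opinion": 15.0})
print(render(label))
```

Keeping the reference values in a single table would let a labeling provider adjust them per topic or article age without changing how the label is displayed.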

Methodology for Determining What Labels to Develop and Evaluate
This subsection discusses the methodology that was used in the development of the labels presented earlier in this section. This process started with the identification and analysis of the different labeling systems used in other areas, which was presented in Section 2. Based on these prior systems and the analysis of them, the content labels were developed.
The recommendation label shown in Figure 6 is designed as a warning label. This design resembles the warning message a user receives when browsing the internet and encountering a web page that has been blocked for suspicious content. It is also similar to the warning that the Windows operating system produces if a user attempts to execute an application file from an unknown publisher. It halts the user's action momentarily and presents a rationale for why the user might wish to reconsider the activity. With a recommendation such as this, the user is still given control over the action and may choose to proceed despite the warning. Thus, it will not prevent an adamant user from staying the course, but it will forewarn unsuspecting users and those who may not be paying full attention to the locations that they are browsing.
The intent of the informational label design shown in Figure 7 is to provide information without telling the user how they ought to act upon it. This style places most of the decision-making burden on the end user. Pure information labeling demands higher analytical reasoning on the part of the consumer but presents this information in a less intrusive manner.
A hybrid informational label design, as shown in Figure 8, lowers the level of analytical reasoning skill necessary to interpret and act upon the data presented. It is more intrusive than pure informational labeling and it may be seen, by some, to be recommending a course of action rather than simply giving the user the tools to determine a course independently. This design is most effective when a pure warning label would be too intrusive for the public to accept but a pure information label would require too much analytical reasoning skill for members of the public to interpret.

Assessment of Online Content Nutrition Facts Form and Component Data
Numerous components could prospectively be included in standardized disclosures for online content, whether these are informational or recommendation-based.
Irrespective of the exact form that the disclosure takes, it is important to determine what information elements should be included. These components could form the basis for making the warning/no-warning decision, if a warning-style system was used, or presented to the consumer if an informational format was used.
Key elements that may be effective for the consumer to view or to analyze include the title of the article, author's name, publication's name and a labeled or detected political perspective [58]. A rating (based on manual or automatic ratings of prior work) could potentially be provided for both the author and publication, to further inform evaluation. In addition, the classification from one or several analyzers could be included. Analyzers have been developed based on content simplicity [59], satirical cues [60] and language analysis [61]. They have been developed using technologies including deep learning [25], data mining [27], reinforcement learning [62] and Bayesian classifiers [63]. Tacchini et al. [64] have even proposed techniques specific to social media networks.
Additionally, content filtering based on the presence of profanity [65] may be desirable, both to identify prospectively deceptive content and to help determine the appropriateness for children. Other age-level appropriateness classification information could be supplied by the content provider, detected through analysis of words and phrases present, or both.
A meta tag exists, already, to allow web page authors to advise the appropriateness of the content of each web page [66].
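A minimal sketch of how a labeling tool might read such a self-declared rating is given below. It assumes a tag of the form `<meta name="rating" content="...">`; the text does not state the tag name or permitted values covered by [66], so both are assumptions here, and `RatingMetaParser` and `declared_ratings` are hypothetical names. Only the Python standard library is used.

```python
from html.parser import HTMLParser


class RatingMetaParser(HTMLParser):
    """Collects the content of <meta name="rating" ...> tags (assumed tag name)."""

    def __init__(self):
        super().__init__()
        self.ratings = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values may be None (e.g. <meta name>), so default to "".
        attr_map = {name.lower(): (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "rating":
            self.ratings.append(attr_map.get("content", "").strip().lower())


def declared_ratings(html_text):
    """Return every author-declared rating value found in the page."""
    parser = RatingMetaParser()
    parser.feed(html_text)
    return parser.ratings


page = '<html><head><meta name="rating" content="adult"></head><body></body></html>'
print(declared_ratings(page))
```

Because the value is supplied by the page author, a labeling system would presumably treat it as one input alongside detected words and phrases rather than as a verified classification.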

Fact, Opinions and Fact Interpretation Opinions
A key consideration for providers, during news article classification, will be distinguishing among facts, pure and properly identified opinions, and fact interpretation opinions. The issues of fact identification have been previously discussed. Pure opinions should accurately indicate their source and be identified as opinion content. For most labeling providers, nothing further than this would be required (conceivably, a provider might label opinions that it disagreed with in some way, despite the undesirability of doing so from the perspective of truthfully informing the consumer).
Fact interpretation opinions can be a particularly problematic area if an individual providing the interpretation has (or claims to have) specialized skills. For example, an opinion from a medical provider (or claiming to be from a medical provider) would carry significant weight in interpreting a medical emergency situation. Given the unpreparedness of consumers to fully evaluate these opinions, it would be critical that sources be validated to ensure that they have the skills that they claim (going one step further than just validating identity for standard opinion articles). Further, the labeling mechanism could be augmented to provide an alert if the opinion is detected to differ significantly from the majority of others expressing an opinion on the topic. However, as the majority is not always correct, it would be prudent to take a value-neutral stance and simply provide other viewpoints, as social media firms did in relation to the U.S. election outcome (discussed in Section 2.3.4).

Reference Considerations
Section 3.1 discussed the use of references to sources as an indicator of the accuracy of a news article. Sources, of course, can include primary collection by the author or citation to other news articles, reference materials, government documents, corporate reports or press releases or scholarly articles. Clearly, the quality of these types of sources is not the same, in terms of the trust that should be placed in them. Even within categories, the trustworthiness may vary significantly: a news story from a well-known news organization or an article in a top journal would likely have undergone more scrutiny than other news stories or articles in other outlets. Thus, a more advanced form of labeling could also consider the trustworthiness of the materials being cited, in addition to the number of cited sources, when evaluating an article.
The system might even want to refer back to its own assessment of the source, if it is a type of source processed by the system, to ascertain its credibility as an article and, thus, its credibility as a source. This type of behavior, over time, could produce a network effect that would be helpful in distinguishing higher-quality sources from lower-quality sources.
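One way this could work is sketched below in Python, purely as an illustration: each citation contributes a trust value based on its source type, and a cited item that the system has itself processed has that value blended with the average trust of its own citations. The trust weights, the 0.5 default for unknown types, the equal blending and the depth limit are all assumptions made for the example, and the function names are hypothetical.

```python
# Hypothetical trust weights per source type on a 0-1 scale (not from the paper).
DEFAULT_TYPE_TRUST = {
    "scholarly_article": 0.9,
    "government_document": 0.8,
    "news_article": 0.6,
    "press_release": 0.4,
}


def source_trust(source_id, source_type, citations, type_trust, depth, seen):
    """Trust of one cited source, refined by the system's own view of it."""
    base = type_trust.get(source_type, 0.5)  # 0.5 for unknown types (assumption)
    # Stop at the depth limit, on citation cycles, or for unprocessed sources.
    if depth == 0 or source_id in seen or source_id not in citations:
        return base
    cited = citations[source_id]
    if not cited:
        return base / 2  # processed by the system but cites nothing
    own = sum(
        source_trust(sid, stype, citations, type_trust, depth - 1, seen | {source_id})
        for sid, stype in cited
    ) / len(cited)
    return (base + own) / 2  # equal blend of type trust and own citation trust


def weighted_citation_score(article_id, citations, type_trust=DEFAULT_TYPE_TRUST,
                            max_depth=2):
    """Return (number of citations, trust-weighted citation total) for an article.

    `citations` maps an article id to a list of (source_id, source_type) pairs.
    """
    cited = citations.get(article_id, [])
    total = sum(
        source_trust(sid, stype, citations, type_trust, max_depth, {article_id})
        for sid, stype in cited
    )
    return len(cited), total


citations = {
    "story": [("journal-1", "scholarly_article"), ("blog-post", "news_article")],
    "blog-post": [("pr-7", "press_release")],
}
count, total = weighted_citation_score("story", citations)
print(count, round(total, 2))
```

Reporting the raw count alongside the weighted total keeps the simpler 'number of citations' metric available while also reflecting source quality.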

Multimedia Content
Multimedia content included in a webpage, such as pictures, audio files and videos, also needs to be considered as part of any sort of labeling of the page content. A manual labeling approach would not be significantly different from a text-only page, in terms of the process: the human evaluator would simply need to evaluate the multimedia content along with the rest of the page. Automated solutions would need to explicitly consider these types of content. Algorithms exist for the identification of objects within images (see, e.g., [67]), detecting video manipulation (see, e.g., [68]) and transcription of speech into text (see, e.g., [69]). One solution for multimedia files would be to use these types of algorithms to supply summaries that could be evaluated along with the text of the page. Other types of media (and certain types of deceptive content) may require alternate or additional processing and represent a key area for future work.

Labeling Central Authority
Historically, as discussed in Section 2, labeling regulations have either been self-imposed or come from government-mandated standards. This section considers potential central authorities and third parties as sources of content labeling.

Industry Self-Regulation
Some online news media organizations label articles as "opinion". The news outlet is the arbiter as to what constitutes an opinion article. The reasoning for how these determinations are made is not necessarily consistent or made available to the reader.
As was discussed in Section 2.3.4, Twitter and others label content which they deem to be manipulated or to convey misleading information [40]. Tweets may be labeled or removed based upon three factors: significant fabrication, being shared in a deceptive manner, and impact on public safety. As shown in Table 4, only content which is "likely to impact public safety or cause serious harm" is likely to be removed altogether. Content which is simply manipulative or deceptive may be labeled.
Table 4. Twitter's guidelines for content labeling and removal decisions (based on and all quoted text from [70]).
Outcome | "Is the content significantly and deceptively altered or fabricated?" | "Is the content shared in a deceptive manner?" | "Is the content likely to impact public safety or cause serious harm?"
"Content may be labeled" | Y | N | N
"Content may be labeled" | N | Y | N
"Content is likely to be labeled or may be removed" | | |
Industry self-regulation risks that the industry's labeling mechanisms may not serve the desired societal purpose. For example, labeling articles as "opinion" may cause other unlabeled articles to be taken as factual. An industry-wide regulatory body, such as the ESRB for the video game industry, could provide definitions for labels that are shared by all participating outlets, enhancing the viability of this approach.

Government Regulation
The ESRB was created due to the threat of action from the United States Congress, which planned to pass regulatory legislation unless the video game industry regulated itself [71]. Nutrition facts labeling was introduced after attempts at industry self-regulation failed. There is, thus, historical precedent for both congressional and industry regulation. However, government regulation of the media industry, in the United States, has implications related to the First Amendment to the United States Constitution, which protects freedom of speech and freedom of the press. Also problematic is that different nations may have different interpretations of what constitutes "safe" versus "dangerous" content. While claiming to protect the public from fake news, a government may instead provide residents with targeted misinformation. Nevertheless, government regulation may be deemed to be required, in some areas, if the problem continues to intensify and an alternate solution cannot be found.

Third-Party Applications
Third-party applications, such as web browser extensions, can be used by users and organizations to filter online content. A third-party labeling application could be designed to overlay news articles and videos. Open-source development could allow a user community to collectively define "opinion" and other terms. One significant challenge is the need to classify content in an automated fashion. Industry self-regulation and government regulation could require that labels be applied before the media is published. Third-party labeling would need to determine this at the time of viewing using an algorithm.
Prospectively, a common format for labeling could be developed so that plug-ins and browser features for labeling could consume label data from multiple sources. This would allow consumers, organizations and government regulators to choose sources of label information that they deem appropriate and have this information presented in a common format. Multiple sources of data could even be presented to facilitate user comparison.
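To make the idea of a common format concrete, the Python sketch below parses label records from more than one provider and groups their values for side-by-side display. The JSON field names (`provider`, `url`, `metrics`), the metric names and the example providers are all hypothetical; no existing label standard is implied.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentLabel:
    provider: str  # who issued the label (supports provider transparency)
    url: str       # the content that the label applies to
    metrics: dict  # category name -> value, e.g. {"opinion_pct": 15}


def parse_label(raw_json):
    """Parse one label record; raises ValueError if a required field is missing."""
    data = json.loads(raw_json)
    missing = {"provider", "url", "metrics"} - data.keys()
    if missing:
        raise ValueError(f"label record missing fields: {sorted(missing)}")
    return ContentLabel(data["provider"], data["url"], dict(data["metrics"]))


def compare_providers(labels, url):
    """Group metric values by category so a plug-in can show providers side by side."""
    table = {}
    for label in labels:
        if label.url != url:
            continue
        for category, value in label.metrics.items():
            table.setdefault(category, {})[label.provider] = value
    return table


records = [
    '{"provider": "provider-a", "url": "https://example.com/story", '
    '"metrics": {"opinion_pct": 15, "citations": 4}}',
    '{"provider": "provider-b", "url": "https://example.com/story", '
    '"metrics": {"opinion_pct": 22}}',
]
labels = [parse_label(record) for record in records]
comparison = compare_providers(labels, "https://example.com/story")
print(comparison)
```

Carrying the provider name in every record means that whichever sources a consumer, organization or regulator selects, the display can always show whose perspective each value represents.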

Labeling Authority Assessment, Bias, Transparency and Neutrality
Irrespective of what entity provides labels, there will be some who do not agree with the labels that may be placed. In some regions, labeling could be a governmental function established by local regulations and consumers would not have the ability to choose their labeling source. In many regions, where free speech is a societal value, multiple labeling providers may be available.
Just as a labeling provider will likely have potential biases, any sort of labeling or assessment of the label provider may be similarly biased. Given this, transparency as to who is providing the labels allows consumers to make an informed choice, where possible, or to at least be aware of whose perspective the labels represent, in all locations. In regions where choice is possible, a consumer feedback mechanism could be provided for rating providers to allow consumers to comment on any concerns about the provider or provider biases.
While some providers may try to be (or claim to be) content neutral (for example, a provider focused on protecting children from certain types of content), true neutrality is unlikely. Also, provider perspective could change over time due to management and staffing changes or other factors. Some consumers may desire to use a provider that attempts to minimize its own bias; others may seek to have labeling by a provider aligned with their personal political, religious or other values.

Risks, Challenges and Limitations
The proposed system presents a number of risks and challenges, which are discussed in this section. The limitations of the analysis that was performed are also discussed.

Risks and Challenges
First, the potential risks and challenges that may arise from labeling online news content are discussed. The first potential challenge is content providers or others 'gaming the system'. Malicious actors could use a botnet to promote a specific article and artificially enhance its credibility and other metrics. This poses a real risk to the effectiveness of an online labeling system and could cause users to trust manipulative content. A content labeling system could be combined with or work cooperatively with an automatic fake news or botnet detection system to potentially avoid this.
A challenge that might occur with an automated, manual, or hybrid labeling system is false positives and false negatives, due to either human or automated system error. In either case, the user would be given bad information regarding the content and may lose trust in the labeling system. This could cause them to disregard its recommendations and warnings or opt out of its use. To mitigate this, a system of checks could be used. Either a human or machine learning-based review process could be used to correct falsely labeled articles.
A third challenge is the profit motive from advertising on deceptive content sites. This was noted by Bakir and McStay [14]. This profit incentive poses a direct threat to online consumer protection. Government oversight could mitigate this risk.
A fourth challenge is news media sources rejecting information labels and disclaimers. News content creators are motivated to appeal to as many readers as possible. Thus, a content labeling system could be perceived as a financial threat. A balance that involves some sensational news may be required for outlets to remain competitive. A cooperative relationship with news content providers is desirable to facilitate implementation. Ideally, any labeling system that is developed would be one that news providers are happy to incorporate.
Another challenge is identifying what information would be relevant, useful, and accessible to the average consumer. A nutrition facts analogy exists: some food labeling does not relate to health but rather to other purchasing decision factors, such as the absence of GMO ingredients, an attribute on which there is no scientific consensus with regard to food safety [72].
Even if appropriate information is identified, promotion will likely be required to drive consumer use as both knowledge and motivation will be required, as is the case with food nutrition facts [19]. Once consumer knowledge deficiencies are identified, the development of knowledge and motivation can be driven through similar educational initiatives as have been used for food labeling.
If a social media platform introduced labeling, it may raise questions regarding the platform's neutrality and impair the protections that this provides. In the United States, Section 230 of the 1996 Communications Decency Act states [73] that "no provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider". This protects social media companies from liability for what their users say on the platform. Similar legislation in other nations includes the Defamation Act 2013 in the United Kingdom [74] and Directive 2000/31/EC in the European Union [75]. Not every nation has a similar law in place. Political pressure, new legislation, or judicial decisions may cause social media platforms to adjust their labeling policies.
Finally, online content labeling could be perceived by some as being an infringement of Americans' First Amendment rights and similar rights in other countries. This may be a key reason for government restraint, thus far, in regulating online news content.
Beyond legal questions, implementing this type of system presents considerable logistical challenges. In order to be effective, the system must be able to accurately classify most articles at launch. However, automatic classification requires exposure to numerous types of content and feedback. Manual classification is a potential solution; however, this requires a herculean effort to classify all content on the internet. Firms or governments that misstep in this process may be accused of censorship and defamation for articles wrongly classified as deceptive while concurrently being blamed for letting some deceptive content through, in the case of false negatives. If the system is seen as not working well, consumers will not use it, preventing it from learning over time. Those with profit motives related to deceptive content may well try to seed disinformation about the system, using their existing delivery channels, to impair its deployment. Perhaps most problematically, if deceptive content creators figure out how to trick the system it may even lend additional credibility to their illicit content.

Limitations
The work presented herein, including the proposed 'nutrition facts' labeling system and related analysis, draws from the analysis of prior systems of labeling in a variety of domains. Given this, the conclusions that are drawn are inherently only as strong as the correlation of behavior between these domains and online news consumption. Assessing this correlation is a key area of planned future work (which is discussed further in Section 6).
It is also important to note that the spread of fake news is not limited to online interactions with the original source of media. In-person interactions may continue the spread of misinformation in a manner which labeling can neither detect nor prevent. According to cognitive dissonance theory, a consumer may rate information as being more credible when it reaffirms previously held political beliefs. As such, consumers may accept and continue to spread this information without attempting to verify its accuracy. There is no label that can be applied to such conversations happening after the point of initial delivery. Consumers with stronger analytical and critical thinking skills are observed to be more likely to perform the necessary research to verify the accuracy of misinformation even when it does conform with their previously held beliefs [76]. Given this, the proposed labeling system's effectiveness and impact will vary, potentially significantly, based on the types of content that it is applied to and the other interactions (and level of interactions) between members of the public outside of online news consumption.
A labeling system can only be as effective as the public's ability to interpret and apply the information it presents. Educational initiatives will be needed to implement such a system. Just as nutrition facts labeling works in concert with K-12 education in health sciences, a misinformation labeling system must be combined with the supporting knowledge and analytical thinking skills. Current educational standards in the United States may insufficiently prepare the public to shield themselves against pervasive misinformation, as a recent study [77] indicated that at least one-third of respondents had been fooled by a fake news article. Whether educational content can be (and is) created related to content assessment and its effectiveness will impact the efficacy of the proposed labeling system. This, also, could cause the results to vary significantly from those enjoyed by prior labeling systems.

The Need for News Nutrition Facts
Just because making such a system is challenging, however, does not mean that it should not be done. U.S. president John F. Kennedy famously said that "we choose to go to the moon in this decade and do the other things, not because they are easy, but because they are hard, because that goal will serve to organize and measure the best of our energies and skills, because that challenge is one that we are willing to accept, one we are unwilling to postpone, and one which we intend to win" [78]. In fact, dealing with deceptive online content may be a modern-day "moon shot" in that it is critical to society and technologically difficult. The challenge will require the development and implementation of new technologies and the potential mobilization of significant societal resources.
News 'nutrition facts' are one potential partial solution to this problem. The approach benefits from not limiting speech while helping consumers to evaluate the speech that they are consuming. In this regard, the solution maintains the important democratic concept of a free market for ideas while combatting those that seek to manipulate this market by supplying false information packaged as legitimate true content. Just as nutrition facts labeling does not prevent a consumer from purchasing a product (it just informs them about the potential benefits and harms of the product), online content nutrition facts are similarly informative.
With this information, consumers can be more aware of potential issues with content that they are consuming and on guard when consuming content that has characteristics typical of intentionally deceptive content. Most importantly, though, each individual is free to choose what they watch, read and write about. No third party, such as a government agency, is preventing or reviewing speech. In fact, the effective voluntary use of online content nutrition facts may prevent the need for (and implementation of) government regulation, just as was the case in video game regulation.

Conclusions and Future Works
This paper has discussed the need for consumers to have access to additional information when determining what news-related content they want to consume and while consuming it. Consumers are not well equipped to recognize intentionally deceptive content and can, thus, benefit from a system that can provide them with relevant information regarding a page's characteristics or, potentially, even a warning about particularly problematic content.
To this end, this paper has reviewed a number of labeling systems for consumer products that provide examples of prospective approaches that can be taken for an online content labeling system. Potential considerations with different approaches (and with labeling online content in general) have also been discussed.
Future work is necessary to advance this concept to a point where it would be ready for consumer use. Work is needed in three key areas. First, metric values to be included in the consumer disclosure must be identified. These will be needed irrespective of what format the disclosure takes.
Second, determining what format the disclosure will take will be critical. It could prospectively be a nutrition/energy facts-style information panel, which provides information without making a recommendation as to what action a consumer should take. Alternately, it could take a form similar to cigarette warnings, where problematic pages are flagged with a warning. Finally, a combined approach could be designed where, in most cases, the facts panel is presented, but a warning is presented in some extreme cases.
Work is also needed on the development of a browser plug-in to implement and facilitate the testing of this type of system. This plug-in could prospectively be configurable to allow it to ingest different content sources, to facilitate user customization and different configurations to comply with consumer desires and applicable regulations in multiple countries. This could also facilitate using different forms of display and different rating information for youth viewers, as opposed to adults.
With regard to these last two topics, additional research on the ways that consumers perceive different types of warnings and the process of identifying, choosing and consuming online content is needed. A survey is planned to collect this data for analysis. One point of particular analysis will be to see if trust in online news media follows the model that Duradoni et al. [7] have shown for online communications. Specifically, they demonstrated that reputation scores impact consumer behavior but that consumers treat those without reputation scores as having a good reputation. Online news media perception will, prospectively, be impacted both by the provider's reputation (if it is well known) and by the 'nutrition facts'-style score. Determining whether a similar pattern regarding source ratings holds for online news content will be a key future research topic.
Once the foregoing has been completed, different system configurations will need to be tested to determine which are most effective at informing consumers about deliberately deceptive content. Their impact on the consumer experience and consumer decision making will also need to be assessed.
Beyond this, the role of online news content in society bears consideration. Future work will, thus, also need to focus on how intentionally deceptive online content can be used by individuals, nation state adversaries, political parties, activist groups and others to achieve their desired goals. The foregoing, as well as the benefits enjoyed due to association with influencers, other questions of group dynamics and the impact of content 'nutrition' information will need to be assessed in this broader context.
In the absence of content producers, site operators and others voluntarily implementing informative content labeling, government regulation may be inevitable. Despite valuing free speech, governments may have little choice but to combat misinformation that impairs societal functioning and poses harm to the public. A regulated solution may be more cumbersome and burdensome than voluntary approaches. It may also fail to keep up with technological innovation and pose other difficulties. Given the foregoing, the development of voluntary solutions, and of solutions that may inform eventual regulation, is key to maintaining the benefits of the internet while fighting the harms posed by deceptive content.

Data Availability Statement:
No new data were created or analyzed in this study. Data sharing is not applicable to this article.
Acknowledgments: This paper updates, extends and builds on a paper [79] titled "Introducing & Evaluating 'Nutrition Facts' for Online Content" presented at the 2020 International Conference on Cyber Security and Protection of Digital Services (Cyber Security). Thanks are also given to two anonymous reviewers who provided suggestions that improved this paper.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
The federal government regulates consumer disclosures on a wide variety of other products and for a myriad of services, which are discussed in this section. Notably, state governments, in some locations, may also have additional labeling requirements. States cannot require companies to replace federal labels with state labels, due to federal preemption doctrine [80]; however, in cases where voluntary federal standards exist these would be overridden by mandatory state standards. The sources of federal regulation range from the FTC Act, which covers most commercial transactions' representations and advertising, to specific regulations for products, such as wool and textiles [81]. Table A1 summarizes products with regulated disclosures in the United States, and the source of the disclosure requirements. Across these regulations, a number of consistent themes and purposes emerge. The first is to provide consumers with standardized information in a consistent form. Heaters, air conditioners, other appliances (shown in Figure A1) and light bulbs (shown in Figure A2) have disclosures that allow consumers to compare products based on standard metrics of power consumption, a common estimated power cost value and a common estimated use per year. Other features, for example, the brightness of lightbulbs, are also described in a standard way for some products. For clothing, material content and cleaning instructions are presented in a common form.
The second theme is that regulations notify consumers of risks. For example, cigarettes and smokeless tobacco have warning labels [82] and clothing that cannot be cleaned using normal methods of washing must notify consumers and include cleaning instructions [83].
The third theme is that advertising and marketing materials must provide consumers with accurate and complete information. The FTC has issued instructions to social media influencers [84] regarding how to disclose affiliation and compensation. It also requires funeral homes to provide price lists and other information to customers in a standardized way [85].
The government has worked with relevant industries to develop government standards, cooperative standards or private standards. In content rating labeling, two different standards are prominent: MPAA ratings for films and V-Chip ratings for television programming. The two standards, which are summarized in Table A2, are relatively similar. The MPAA ratings were developed by the Motion Picture Association of America (MPAA) [88] while the V-Chip ratings were developed by the Federal Communications Commission (FCC) [89]. Both the MPAA and V-Chip ratings assign an appropriateness level (associated with viewer age) to each program. MPAA ratings include a free text field where the reason for the rating can be explained, for the PG, PG-13, R and NC-17 ratings [21]. V-Chip ratings also include up to four letters (D, L, S and V) which provide additional information about the reason for the rating [22]. The D, L, S and V value meanings are described in Table A3.

| MPAA Rating | V-Chip Rating | Meaning |
|---|---|---|
| N/A | TV-Y | For the very young, targeted at ages 2-6 |
| N/A | TV-Y7 | For children aged above 7 |
| N/A | TV-Y7-FV | TV-Y7 with "fantasy violence that may be more intense or more combative" |
| G | TV-G | Suitable for "all ages" and "general audiences" |
| PG | TV-PG | Programs where "parental guidance" is recommended and which may not be appropriate for young children |
| PG-13 | TV-14 | Programs which may have content inappropriate for children under age 13 or 14 |
| R | TV-MA | Programs for older audiences, 17 or above |
| NC-17 | N/A | Children under 17 are not allowed, even with parental supervision |
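The two-part structure of a V-Chip rating (an age-based level plus optional descriptor letters) and its correspondence to the MPAA ratings can be sketched as a small data structure. This is an illustrative sketch only: the class name `VChipRating`, its field names and the use of a plain string for the descriptor letters are assumptions made here, not a format defined by the FCC or the MPAA.

```python
from dataclasses import dataclass

# V-Chip levels in the order they are listed in the ratings table.
V_CHIP_LEVELS = ("TV-Y", "TV-Y7", "TV-Y7-FV", "TV-G", "TV-PG", "TV-14", "TV-MA")

# Descriptor letters and their quoted meanings.
DESCRIPTORS = {
    "D": "suggestive dialogue",
    "L": "coarse or crude language",
    "S": "sexual situations",
    "V": "violence",
}

# MPAA rating -> corresponding V-Chip rating (None where no counterpart is listed).
MPAA_TO_V_CHIP = {
    "G": "TV-G",
    "PG": "TV-PG",
    "PG-13": "TV-14",
    "R": "TV-MA",
    "NC-17": None,
}


@dataclass(frozen=True)
class VChipRating:
    """Hypothetical container: an age-based level plus up to four descriptor letters."""

    level: str
    descriptors: str = ""  # any combination of the letters D, L, S and V

    def __post_init__(self):
        # Reject levels and letters that are not part of the scheme.
        if self.level not in V_CHIP_LEVELS:
            raise ValueError(f"unknown V-Chip level: {self.level}")
        unknown = set(self.descriptors) - set(DESCRIPTORS)
        if unknown:
            raise ValueError(f"unknown descriptor letters: {sorted(unknown)}")

    def reasons(self):
        """Return the meanings of the attached letters, in D, L, S, V order."""
        return [DESCRIPTORS[d] for d in "DLSV" if d in self.descriptors]


rating = VChipRating("TV-14", "LV")
print(rating.level, rating.reasons())
```

Keeping the level and the descriptor letters as separate fields mirrors the way the rating separates an overall appropriateness judgment from the reasons behind it, which is the same separation a descriptive label for online content would need.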

| Letter | Meaning |
|---|---|
| D | "suggestive dialogue" |
| L | "coarse or crude language" |
| S | "sexual situations" |
| V | "violence" |

For video games, the Entertainment Software Rating Board (ESRB) developed a ratings system similar to the MPAA ratings. Each rating has a letter rating (listed in Table A4) and a text description field [90]. There is also a supplemental field to describe interactive parts of the game, such as in-game purchases [90]. ESRB ratings are assigned either by a team of reviewers, based on a survey and a video of game play (for boxed games), or automatically, based on a survey (for downloadable games) [91].

Table A4. ESRB Ratings.
The government has established a role for itself in ensuring that consumers have accurate information and are warned about dangerous products. This role, in the area of content, has been far more restricted, though movie, TV and game ratings have been developed. Federal labeling and warnings in content areas have focused predominantly on protecting children and ensuring the accuracy of commercial speech.