4.3. Results
Study respondents were individuals who received and clicked on links in an email, as described in Section 3.1. The only qualifications required of survey participants were providing consent to participate and being over 18 years of age. As respondents could decline to answer any question and could stop taking the survey at any time, not all questions were answered by all respondents.
University A had 47 respondents who completed the first survey (where completion is defined as answering questions beyond the consent question). University B had 59 respondents complete the first survey. While most respondents answered all or most questions, one University B response was discounted because only a single question beyond the consent question was answered. Key demographic information for respondents, such as their age, income level, and education level, is presented in Table 1.
Analysis indicates that the political alignments of both the article sponsor and publisher have among the strongest effects on the personal perceptions of respondents from University A: 37.2% of respondents stated that these categories affect their perceptions of the trustworthiness of news a great deal, and 58.1% indicated that they affect their perceptions either a lot or a great deal. The publisher and sponsor had the second highest impact at the “a great deal” level, with 34% of respondents indicating this response for each. Notably, the publisher (74.5%), author political alignment (67.4%), and quality (61.9%) had the highest proportions of respondents rating them as either “a great deal” or “a lot”. Sponsors, publisher political alignment, sponsor political alignment, reading level, and technical statements all had at least 50% of respondents from University A indicating their importance as either “a great deal” or “a lot”.
Regarding sponsorship, most respondents (57.4%) feel that sponsorship has at least “a lot” of an effect on their own perception of trustworthiness, yet only 19.6% feel that sponsorship affects most people’s perceptions of trustworthiness “a lot” or more. Further, 61.7% of respondents believe that, when acting in an ideal manner, sponsorship should have at least “a lot” of an effect on the trustworthiness or credibility of an article. In this instance, it appears that participants believe they both are and should be affected by this metric, while others are not. Given this, including this metric on labels would seem to be beneficial.
In contrast, most respondents feel that an article’s virality has an effect on other people (68.3% of respondents indicated that it has “a lot” or “a great deal” of impact on others); yet fewer (33.3%) indicated that it has “a lot” or “a great deal” of impact on their own personal beliefs. This is especially interesting since only 14.3% of respondents indicated that virality should ideally have “a lot” or “a great deal” of impact (tying with controversy as the least ideally important metric). This result shows a stark contrast between how individuals perceive their own beliefs, others’ beliefs, and ideal beliefs. However, it does not necessarily indicate that the metric should be completely ignored.
Notably, for many metrics, the gap between the number of respondents indicating that an item has “a lot” or “a great deal” of impact on their own beliefs and the number indicating a similar impact on others’ beliefs is much larger than the gap between respondents’ own beliefs and ideal beliefs.
Figure 5a, Figure 6a and Figure 7a present all of the data for University A.
Analysis of the University B data similarly indicates that the publisher, along with the political alignments of both the article sponsor and publisher, has among the strongest effects on respondents’ personal perceptions: 44.1% of respondents consider the publisher to have “a great deal” of impact on their personal perception of the trustworthiness and credibility of an article. Over three-quarters of University B respondents (78.0%) indicated that the publisher would have “a lot” or “a great deal” of impact. This was somewhat higher than the impact anticipated for others (23.7% indicated “a great deal” of impact and 52.5% indicated either “a lot” or “a great deal” of impact) and the ideal impact (33.9% of University B respondents thought the publisher should have “a great deal” of impact and 27.1% said it should have “a lot” of impact).
For University B, after the publisher’s identity, the publisher’s political orientation (73.1% “a lot” or “a great deal” of impact), the quality (67.3% “a lot” or “a great deal” of impact) and the author’s political orientation (59.6% “a lot” or “a great deal” of impact) were indicated as having the most impact on individuals. The author’s political orientation (73.1% “a lot” or “a great deal” of impact), publisher’s political orientation (63.5% “a lot” or “a great deal” of impact), and controversy level (65.2% “a lot” or “a great deal” of impact) were identified as being the most impactful for others. The quality (65.4% “a lot” or “a great deal” of impact), the publisher (61.0% “a lot” or “a great deal” of impact), and the sponsors (50.8% “a lot” or “a great deal” of impact) were identified as the metrics that, ideally, would be the most impactful.
Figure 5b, Figure 6b and Figure 7b present all of the data for University B.
Given that most respondents indicate that virality should not have a high impact on perception (14.3% at University A and 12.8% at University B indicated an ideal impact of “a lot” or “a great deal”) but believe that it does impact most other people (68.3% at University A and 61.7% at University B indicated “a lot” or “a great deal” of impact on others), this metric requires significant additional analysis in future work. If respondents’ perceptions of others are accurate, the low ideal score does not necessarily mean that the metric should not be included. A key question that will need to be answered is whether this perception of others is accurate and, if so, whether it actually demonstrates a negative correlation between the metric and credibility and trustworthiness.
Generally, the data can be analyzed in terms of the indicated values for self, other, and ideal for each metric (and on a per-school basis). Table 2 presents the relevant interpretations. For example, if all three have high levels of indication, this can be taken as indicating that respondents value the metric and believe that they should. If none of the three has a high level, this can be taken as respondents not valuing the metric and believing that to be appropriate.
The data presented in Figure 5, Figure 6 and Figure 7 are now analyzed in terms of whether more than 50% of respondents indicated valuing a metric at the “a lot” or “a great deal” level. This analysis is presented in Table 3. By juxtaposing Table 2 and Table 3, the interpretation of each metric for each university is readily apparent. For example, the title metric falls, for both schools, under the category of respondents valuing the metric despite this not being ideal. Publisher, on the other hand, is valued by the respondents, believed to be valued by others, and seen as ideal to value by over 50% of respondents at both schools. This indicates that respondents value the publisher metric and believe they should.
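As a concrete illustration of this threshold analysis, the following minimal sketch classifies a metric as valued under each framing and forms the (self, other, ideal) pattern that Table 2 interprets. The response lists are hypothetical placeholders rather than the study’s data, and the is_valued helper is an assumed name, not part of the study’s analysis code.

```python
HIGH = {"a lot", "a great deal"}

def is_valued(responses):
    """True if more than half of the responses are 'a lot' or 'a great deal'."""
    high = sum(1 for r in responses if r in HIGH)
    return high / len(responses) > 0.5

# Hypothetical responses for one metric under the three framings.
publisher = {
    "self":  ["a great deal", "a lot", "a lot", "a moderate amount"],
    "other": ["a lot", "a great deal", "a little", "a lot"],
    "ideal": ["a great deal", "a lot", "a lot", "none at all"],
}

# A (self, other, ideal) pattern of (True, True, True) maps to the Table 2
# interpretation that respondents value the metric and believe they should.
pattern = tuple(is_valued(publisher[f]) for f in ("self", "other", "ideal"))
print(pattern)  # -> (True, True, True)
```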
Due to the differences between the two schools (and the demographics of the regions in which they are located), comparing the perception of the metrics between the two is informative. A comparison of Figure 5, Figure 6 and Figure 7 shows differences between the perceptions at University A and University B in some areas, and minor fluctuations in others. One of the most notable differences is in the perception of the sponsor’s political alignment, with approximately 60% of respondents at University A indicating its personal importance and only approximately 40% at University B doing so (in both cases at either the “a lot” or “a great deal” level). Notably, the same patterns between different metrics are largely reflected in the data from both universities.
To facilitate the comparison of the relationships between actual and ideal perceptions, an integer value between 0 and 4 is applied to the responses in each category (a great deal, a lot, a moderate amount, a little, or none at all), where 0 represents “none at all” and 4 represents “a great deal”. The mean response value is then compared. Figure 8a,b present these results for Universities A and B, respectively. In seven of the categories, the comparative ranks of the respondents’ self-importance, importance-to-others, and ideal-importance indications are the same between the two schools. In author and sponsors, the ideal and self-importance values are close, but oppositely ranked. Publisher political alignment, sponsor political alignment, reading level, and technical statements have more pronounced differences.
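A minimal sketch of this scoring, assuming hypothetical response lists rather than the study’s data, could look as follows:

```python
# Mapping from Likert response to the 0-4 integer scale described above.
SCORES = {
    "none at all": 0,
    "a little": 1,
    "a moderate amount": 2,
    "a lot": 3,
    "a great deal": 4,
}

def mean_score(responses):
    """Mean of the 0-4 integer values for a list of Likert responses."""
    return sum(SCORES[r] for r in responses) / len(responses)

# Hypothetical responses for one metric under the three framings.
quality = {
    "self":  ["a great deal", "a lot", "a moderate amount"],
    "other": ["a little", "a moderate amount", "a lot"],
    "ideal": ["a great deal", "a great deal", "a lot"],
}

for framing, responses in quality.items():
    print(framing, round(mean_score(responses), 2))
# self 3.0, other 2.0, ideal 3.67
```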
Sponsorship is one interesting area to review, as respondents’ personal perceptions of the effect of sponsorship match the ideal level quite closely for University A and, for University B, much more closely than they match others’ perceptions. In both cases, there is a significant gap between respondents’ perceptions of ideal importance and how they perceive others’ perceptions of importance. A similar pattern, in which others’ perceptions are not rated as highly as individuals’ own and ideal perceptions, is present with regard to the publisher, date, author, sponsors and quality metrics at both schools. A similar gap is present for the reading level metric at University A, but not at University B.
Large differences in the virality and controversy metrics between others’ perceptions and ideal perceptions (and between others’ perceptions and self-perceptions) are present in the data from both schools. In both cases, the others’ importance value notably exceeds the self- and ideal-importance values. Thus, this analysis indicates additional areas of prospective study for understanding label metrics. It also further confirms the many similarities between the patterns in the data, despite the demographic differences between the two schools.
In addition to comparing the data between the two universities (and the associated demographic differences that this comparison includes), the data can also be analyzed in terms of the age of the respondents. Thus, the data are also compared between two age groups: individuals aged between 18 and 29 and those aged 30 years or older. These data are presented in Figure 9, Figure 10 and Figure 11. This comparison is of particular interest as the 18–29 year olds are a group that Helsper and Eynon [59] term “second-generation digital natives”. These individuals are differentiated from older groups by their “familiarity and immersion in this new, Web 2.0, digital world”. While there are likely demonstrable differences among subgroups within the older group, there were insufficient respondents in these subgroups to conduct analysis beyond the comparison between the “second-generation digital natives” and others.
While there are many similarities between the two groups, the patterns within the data are not as well aligned as when comparing the two schools. Given this, it would seem that age may be an important indicator of the importance ascribed to metrics (whereas the two schools’ data were relatively similar). In fact, age may have a more pronounced impact than many (or even most) of the demographic differences discussed in Section 3.2, particularly for self-perception and the perception of others. Figure 12 illustrates this comparison, with Figure 12a showing the differences (using the point method used in Figure 8) between University A and University B and Figure 12b showing the differences between the 18–29 and 30+ age groups.
As shown in the figures, the difference between the two schools is smaller than the difference between the two age groups in 8 of the 13 self-perception categories (all except date, author, sponsor political alignment, virality, and technical statements). The difference between the two schools is smaller than the difference between the two age groups in 11 of the 13 others’ perception categories (all except publisher’s political alignment and virality). Finally, and interestingly, the difference between the two schools is only smaller than between the two age groups in 4 of the 13 categories for the ideal perceptions. Thus, while the differences are smaller in approximately 60% of cases between the two schools, the question of why the ideal perception differs from the other two remains. A key topic for prospective future work will, thus, be to investigate why the three different types (self, other, and ideal) show such notable differences.
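A minimal sketch of this per-category comparison, assuming hypothetical mean scores (the actual values underlie Figure 12), could look as follows:

```python
# Hypothetical (University A, University B) mean scores per category.
school_means = {
    "publisher": (3.0, 3.2),
    "quality":   (2.8, 2.9),
}
# Hypothetical (18-29, 30+) mean scores per category.
age_means = {
    "publisher": (3.1, 2.6),
    "quality":   (2.9, 2.4),
}

for category in school_means:
    school_diff = abs(school_means[category][0] - school_means[category][1])
    age_diff = abs(age_means[category][0] - age_means[category][1])
    smaller = "school" if school_diff < age_diff else "age-group"
    print(f"{category}: school diff {school_diff:.2f}, "
          f"age diff {age_diff:.2f} -> {smaller} difference is smaller")
```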
Table 4 presents the same data for the age groups that Table 3 presented for the two schools. These values can be juxtaposed with Table 2 to identify the relevant interpretation of the combination of self, other, and ideal perceptions for each metric. Note that while there were only five conditions in which the two universities differed, there are 12 in which the two age groups differ and would thus receive a different interpretation.
There are also large differences in the metrics for which Table 3 and Table 4 show agreement between the two groups. There are eight conditions where both Table 3 and Table 4 have both groups indicating that over 50% of respondents believe the metric has “a lot” or “a great deal” of impact. However, there are six conditions where both groups indicated over 50% impact in one table, but not in the other. The areas of difference are title—other, author—self, author political alignment—other, publisher political alignment—other, virality—other, and controversy—other.
Overall, respondents indicated that their own self-perceptions of effect importance are not exceedingly different from the ideal perceptions of effect importance. At University A, the average level of difference between self and ideal perceptions was 0.43; it was 0.44 at University B. However, at University A, a greater discrepancy (0.56) exists between the self-perception and the perception of others. This is not the case at University B, where this value is 0.47, just slightly higher than the self-versus-ideal comparison.
Depending on whether a difference is positive or negative, respondents are indicating that the perceptions in question are affected more than is ideal (positive) or not affected enough (negative). In most cases, respondents feel that they themselves are affected more than they should be (in 10 of 13 cases at University A and 12 of 13 cases at University B).
On the other hand, respondents felt that others were not affected enough. In 7 of the 13 categories at University A, and in 8 of the 13 categories at University B, respondents indicated that the effect on others’ perceptions fell short of the ideal level. Additionally, there was no instance, at either school, where respondents indicated feeling that they themselves were not affected enough while believing that others were affected more than enough.
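A minimal sketch of this signed-difference reading, assuming hypothetical mean scores on the 0–4 scale, could look as follows:

```python
# Hypothetical 0-4 mean scores for one metric.
metric_means = {"self": 3.1, "other": 2.2, "ideal": 2.6}

# Positive: affected more than is ideal; negative: not affected enough.
self_vs_ideal = metric_means["self"] - metric_means["ideal"]
other_vs_ideal = metric_means["other"] - metric_means["ideal"]

print(f"self vs ideal:  {self_vs_ideal:+.2f}")   # +0.50 -> more than ideal
print(f"other vs ideal: {other_vs_ideal:+.2f}")  # -0.40 -> not enough
```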
Of particular interest are the cases of publisher and quality (for both schools), reading level for University A, and author and sponsors for University B. In each of these cases, respondents reported the belief that they utilize the metric more than is ideal while others utilize it less than is ideal. This suggests that respondents feel they not only outperform others with regard to these metrics, but also believe themselves to be, if anything, too vigilant regarding them. This may be problematic, as it could indicate that initiatives to educate potential label system users will be met with resistance, as users may not feel they are part of ‘the problem’. This appears to be another example of the third-person effect.
Given this analysis, it appears that some metrics will likely be beneficial to include in a news labeling system while others may not be. However, it also appears that there may be some instances where additional education could prove to be helpful in improving the utility of various metrics to users.
Table 5 and Table 6 characterize the metrics in terms of whether they are perceived to be underutilized, overutilized or appropriately utilized. For metrics with an “L,” indicating less-than-ideal usage, it may be easy to convince users of their benefits. Those perceived as having the right level of utilization may not require a change from the present status. Finally, those for which less utilization may be ideal may either be metrics to avoid or metrics for which an education campaign is required to inform users about the benefits and efficacy of the metric.
In Table 5 and Table 6, no metric differs by more than one step on the continuum from ‘too much’ to ‘right amount’ to ‘too little’ when comparing age groups or universities. The largest difference is with regard to the title metric, where most respondents believe it is used too much by everyone, but respondents aged 30 and over believe it is used the right amount by everyone. The second largest differences are in perceptions regarding reading level and sponsor political alignment. Respondents from University A, as well as younger respondents, believe they use reading level too much; the remaining groups believe they themselves use it the right amount. Interestingly, on the metric of sponsor political alignment, it is University A and the older group that agree that the metric is used too much by respondents themselves, while the remaining groups believe they use this metric the right amount. There is agreement that others use both the reading level and sponsor political alignment metrics the right amount.
The metrics with the least difference amongst these groups are date, sponsors, author political alignment, publisher political alignment, and quality. There is agreement that date, sponsors, and quality are used the right amount by respondents themselves and too little by most people. There is also agreement that author political alignment and publisher political alignment are used too much by both respondents and others.
The remaining metrics differed by only one step and for only one group, and each group had an instance of differing from the others. Most respondents believe the publisher metric is used too much by the respondent and the right amount by others; respondents aged 30 and over, though, believe that it is used too little by others. Most respondents believe that the author metric is used the right amount by respondents themselves and too little by others; respondents from University A, though, believe it is also used too little by respondents themselves. Most respondents believe the virality metric is used too much by both respondents and others; respondents from University B, though, indicated believing that they use it the right amount themselves. Most respondents believe the controversy metric is used too much by both respondents and others; respondents aged 30 and over, however, indicated believing that respondents use it the right amount themselves. There is agreement that the technical statements metric is used too much by the respondents themselves. Most respondents also believe others use this metric too much; however, University A respondents indicated believing that others use it the right amount.
In all cases where there was a single disagreeing group, the disagreement was that the metric had a lower impact than observed by the other groups. This difference was between ‘right amount’ and ‘too little’ with respect to ‘other people’ in one case and with respect to ‘self’ in one case. The difference was between ‘too much’ and ‘right amount’ with respect to ‘other people’ in one case and with respect to ‘self’ in two cases.
As such, while there was a trend for disagreement downward on the axis from ‘too much’ to ‘too little,’ there was no correlation between disagreement and the ‘self’ or ‘others’ framing specifically. This shows that the disagreements were not simply self-interested. The age 30 and older group disagreed most frequently, disagreeing three times and indicating with equal frequency that they or most others used a metric less than was perceived by the other groups. University A was the sole disagreeing group twice, once with a differing self-perception and once with a differing perception about others. University B was the sole disagreeing group once, believing themselves to be impacted by the virality metric the ‘right amount’ rather than ‘too much.’
A potential bias is observed whereby survey respondents were less likely to see themselves as doing ‘too little’ but more likely to see themselves as doing ‘too much.’ Similar to the earlier observation, a biased belief that the survey-taker does ‘too much’ rather than ‘too little’ could complicate obtaining buy-in from individuals through educational initiatives. With only one exception (University A on the author metric), none of the four groups demonstrated the perception that they used a metric ‘too little.’ Perceptions about other people were much more likely to lean towards ‘too little,’ as this appeared as the second-most frequent observation about ‘others’ for each of the four groups in Table 5 and Table 6. This also appears to be an example of the third-person effect.
Further research will be needed to identify how the metrics perform and to separate metrics that are undesirable to use from those that would benefit from educational initiatives and become effective. Given the discussion above, the virality and controversy levels are metrics that will need to be investigated further; they may also be areas where educational initiatives could be effective. This label study has thus answered several key questions; however, it has also raised a number of new ones, which will serve as key areas of prospective analysis for future work.