3. Results
This section presents the results of our experiment on the collated online news using the DBSCAN algorithm.
Table 1 breaks down the number of clusters as well as their sizes. We found 60 clusters formed by 157 news reports. What is more, 462 news items were classified as single, thus not belonging to any group.
To validate the solutions DBSCAN found, we manually read and analyzed the clusters. As a result, we identified that the news reports in each group featured similar or identical excerpts, which confirms that such excerpts originated from a piece of news in common (see Appendix A for an example of a news report extracted from a cluster DBSCAN identified).
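As a minimal sketch of how the number of clusters and of "single" items can be read off a DBSCAN labelling, the snippet below assumes the labels follow the scikit-learn convention, in which noise points receive the label -1. The `labels` list and the `summarize` helper are illustrative assumptions, not the study's data or code.

```python
from collections import Counter

# Toy labels (hypothetical): one label per news item; -1 marks noise,
# i.e. a "single" item that DBSCAN did not assign to any cluster.
labels = [0, 0, 1, -1, 1, 1, -1, 2, 2, -1]


def summarize(labels):
    """Return (number of clusters, clustered items, single items)."""
    sizes = Counter(label for label in labels if label != -1)
    n_single = sum(1 for label in labels if label == -1)
    return len(sizes), sum(sizes.values()), n_single


n_clusters, n_clustered, n_single = summarize(labels)
print(n_clusters, n_clustered, n_single)
```

Applied to the study's labelling, the same summary would be expected to yield the 60 clusters, 157 clustered reports, and 462 single items reported above.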
Table 2 depicts the distribution of news and clustered (replicated) news reports throughout three time periods: pre-campaign (1 January 2015 to 21 November 2018), during-campaign (22 November 2018 to 31 March 2019), and post-campaign (1 April 2019 to 31 December 2019). The pre-campaign period encompassed 164 news reports (26.49% of total news reports) and 18 clustered reports (11.46% of the 157 clustered reports); the during-campaign period, 85 news reports (13.73% of total) and 24 clustered reports (15.29%); and lastly, the post-campaign period, 370 news reports (59.77% of total) and 115 clustered reports (73.25%).
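The percentages above can be reproduced from the reported counts. The short check below is a sketch using only the figures quoted in the text; the variable and function names are ours. It shows that the second percentage in each period is recovered when the denominator is 157, the number of news reports that belong to clusters.

```python
# Counts reported in the text for each period: (news reports, clustered reports).
periods = {
    "pre-campaign": (164, 18),
    "during-campaign": (85, 24),
    "post-campaign": (370, 115),
}

total_news = sum(news for news, _ in periods.values())
total_clustered = sum(clustered for _, clustered in periods.values())


def share(part, whole):
    """Percentage of `whole` represented by `part`, rounded to two decimals."""
    return round(100 * part / whole, 2)


shares = {
    name: (share(news, total_news), share(clustered, total_clustered))
    for name, (news, clustered) in periods.items()
}
for name, (news_pct, clustered_pct) in shares.items():
    print(f"{name}: {news_pct}% of news, {clustered_pct}% of clustered reports")
```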
Figure 3 provides the results of the replications observed, where each dot indicates the publication of a news item and the line represents the time interval between similar news. Each color represents a different cluster, and the dashed vertical lines mark off the campaign periods.
Regarding the during-campaign period, we verified that the proportion of replications in this period was 15.29%, a modest growth compared with the pre-campaign period (11.46%). Moreover, the post-campaign period accounts for 73.25% of the replications, which indicates that the campaign actions encouraged an organic diffusion of syphilis-related information.
Another issue deserving of investigation is that of single news, i.e., news items that do not belong to any cluster. Considering the same scenarios for such news, we found that 26.49% of news was produced before the campaign, 13.73% during the campaign, and 59.77% post-campaign. When evaluated from a managerial perspective, these results demonstrate that single news items are highly significant since they address different facets of syphilis.
From this viewpoint, by using time series decomposition analysis, we processed the data and plotted it as a trend graph to verify the turning point in the news trend. According to Figure 4, there was a steady publication tendency of syphilis-related news in Brazil from September 2015 to February 2018. However, after March 2018 (the first dashed vertical line), it is possible to observe an upward slope that steepens during the campaign (November 2018 to March 2019) and continues to rise until the end of the observed period, with no fall. Before the SNP, syphilis campaigns were integrated into HIV or other sexually transmitted infections (STIs) campaigns, focusing on SIP or CS, without much investment in resources and different types of media [22]. Adopting a distinct approach, with greater resource investment and media diversity, the 2018-2019 campaign highlighted collective aspects without attributing blame to individuals, promoting more diverse representations of target audiences [23].
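The text does not specify which decomposition method was used. As an illustration only, the sketch below extracts a trend with a centered 2 x 12 moving average, the trend estimator of classical decomposition for monthly data. The function name `centered_trend` and the monthly counts are hypothetical.

```python
def centered_trend(series, period=12):
    """Trend by a centered 2 x `period` moving average (`period` must be even).

    Returns a list aligned with `series`; positions without a full window are None.
    """
    half = period // 2
    trend = [None] * len(series)
    for t in range(half, len(series) - half):
        window = series[t - half : t + half + 1]
        # The two end points get half weight so the window is centered on t.
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        trend[t] = total / period
    return trend


# Hypothetical monthly counts: flat for 12 months, then rising by one per month.
monthly_counts = [5] * 12 + [5 + k for k in range(1, 13)]
trend = centered_trend(monthly_counts)
```

A flat stretch followed by a sustained rise in `trend` is the kind of pattern the trend graph is read for; locating the turning point precisely would require the actual publication counts.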
The graph in Figure 4 displays an inflection point in the trend that started in May 2018, which is explained by the fact that SNP activities effectively began across Brazil that month, encompassing several seminars and inter-federative actions [3]. This drew the local media’s attention and mobilized communication efforts among state and municipal health secretaries in Brazil.
Considering this perspective, it is reasonable to infer that the pre-campaign period, with 164 news items, had a low level of replications (11.46%), thus representing an earlier and more stable time frame. Nationwide diffusion of the SNP communication campaign, starting at the end of the third quarter of 2018, is considered a new initial incentive in this analysis. In addition, the during-campaign period, accounting for 85 news items and 15.29% of replications, characterizes the latency period in which the syphilis issue was being set on the intermedia agenda. Thus, the time span we refer to as "post-campaign," with 370 news items and 73.25% of replications, corresponds to the period in which the topic was set on Brazilian digital media agendas: the syphilis issue was taken up by numerous other news outlets, with coverage sustained over time.
4. Discussion
The results of the clustering of syphilis online news during the three periods of the SNP campaigns corroborate the understanding of the dynamics and enhancement of public health interventions in the field of communication. There is solid evidence that combining mass media with other communication strategies is highly effective [6].
Such an effect can be observed in our results as the rise in syphilis online news is still noticeable even after the campaign’s end. Therefore, using techniques and methods applied to digital health technologies based on machine learning, such as those used in this article, becomes essential for measuring the impacts of communication agendas in public health. This aspect is critical in contexts of public health crises, such as the syphilis epidemic in Brazil, due to the urgency in measuring the effectiveness of public health response communication actions.
Notably, 74.64% (n = 462) of the news analyzed by the algorithms developed for this study were not grouped into clusters. Not belonging to a cluster implies that the news item is “single”, i.e., it was not replicated. We investigated this phenomenon for all the news items in the corpus with this hallmark. In this context, we observed that 100% of these news stories were effectively material of an authorial nature produced by the journalistic institutions or organizations themselves, with specific and qualitative approaches to the different themes of the disease. That is an interesting scientific finding of our study, for it validates the technique applied in this research method. For the area of management, which is concerned with mass communication in public health, the existence of algorithms with this distinction capacity is relevant to evaluating the quality of journalistic texts on health care.
All 462 news items identified as “single” were news/special reports from national or regional media, in addition to editorials, news pieces published by universities, NGOs, and health organizations, or op-ed articles. Regarding the websites on which single news stories were published (Table 3), we identified a nearly proportional distribution between national (n = 68) and regional (n = 86) corporate media and original materials posted by the MoH and state and municipal health secretariats (n = 162) or health or societal organizations or educational institutions (n = 146). Of note, the most significant amount of news stories in the national and regional press outnumbered publications by government and societal institutions.
As for the journalistic genre [24] (Table 4), the set of 462 single news primarily includes news items (n = 295), i.e., information based on recent events or updates about already well-known facts. Next, we have the opinion articles (n = 73) representing the singular voice of journalists, experts, or renowned figures who express their opinions about certain aspects of the syphilis problem. The news reports (n = 71) identified are texts that, in addition to presenting syphilis-related facts, provide a more in-depth editorial treatment and a plurality of voices and data, aiming to contextualize, situate, or present readers with perspectives on the infection. Informative materials were classified under the category "Other." These include podcasts, murals, reports, glossaries, and entries that present significant data on syphilis but lack the temporality and factual emergency that characterize news stories and reports.
Considering the media typification of the 462 single news items as paid media, spontaneous media, or organic media [25] (Table 5), it is possible to verify that the volume of spontaneous news disseminated (n = 282) is roughly 1.6 times that of organic media (n = 172), i.e., state-owned media run by the MoH and SNP, added to the owned media run by autarchies and state and municipal health secretariats. This reflects the campaign’s effectiveness in sparking voluntary interest from the Brazilian business media, which has incorporated the syphilis issue into its agenda, thus strengthening the tendency of the press to transfer its interests to society’s agenda [13].
Figure 5 displays the agenda-setting effects between media (intermedia), between media and organizations, and between organizations (inter-organization). There were seven clusters from one national media to another (clusters 14, 16, 48, 53, 54, 56, and 59), although cluster 16 starts in a public organization and then sets the media agenda. Additionally, four intermedia agenda-setting effects were observed among regional news media (clusters 24, 32, 43, 50).
Figure 5 also evidences one case of agenda-setting effect between a national media organization and a private communication organization (cluster 28) and another between a national media and a public organization (cluster 52). From the regional media to a public organization, one agenda-setting process was identified (cluster 46). Cluster 0 shows the interesting movement of agenda-setting from one organization to another (inter-organization) that leads to publication in national media. Two other agenda-setting effects ran from public organizations directly to the national media (clusters 2, 45). Seven agenda-setting effects ran from public organizations to regional media (clusters 20, 21, 31, 36, 40, 44, 49). Lastly, agenda-setting between organizations running news pages was observed in five clusters (30, 33, 34, 37, 51).
Thus far, there are 29 clusters wherein agenda-setting effects include spontaneous news, i.e., news produced without the action or agency of the organizations responsible for the communication campaigns of the SNP. The news reports published on webpages of organizations that wielded agenda-setting effects within other organizations (inter-organization) in an organic manner, meaning that such organizations are directly associated with the MoH or the SNP (state and municipal health secretariats), comprised 13 clusters in total (clusters 8, 9, 12, 27, 35, 38, 39, 41, 42, 47, 55, 57, 58). In relation to the agenda-setting of the syphilis issue involving organizations, we also identified the “news recycling” phenomenon in 12 clusters, where texts discussing the disease are reposted in different media spaces or updated by these organizations. Eight of these clusters were identified through spontaneous news (clusters 1, 4, 5, 6, 7, 10, 11, 19), and three others were in organizations linked to the project or the MoH in an organic way (clusters 3, 13, 22). The remaining one was a result of paid media whose news was reutilized on the website of the institution responsible for the SNP, that is, the Laboratory for Technological Innovation in Health (LAIS) at the Federal University of Rio Grande do Norte (UFRN).
It is worth noting that, in cluster 14, the intermedia agenda-setting effects occur between journalistic companies within the same group. In Cluster 53, we identified a news report published on 05 November 2019 by the newspaper of national reach “Folha de São Paulo”. Such a piece was replicated on the website of the Regional Council of Medicine of Pernambuco on 06 November 2019 and by the newspaper “O Tempo”, from Belo Horizonte, Minas Gerais, on 07 November 2019. The latter case demonstrates such effects [12,13], whereby a news story conveyed in one of Brazil’s top newspapers was replicated two days later in a regional newspaper in the state of Minas Gerais.
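The replication interval within a cluster can be computed directly from the publication dates. The sketch below uses the three dates reported for cluster 53; the `replication_lags` helper is a hypothetical name introduced for illustration.

```python
from datetime import date

# Publication dates for cluster 53 as reported in the text.
publications = [
    ("Folha de São Paulo", date(2019, 11, 5)),
    ("Regional Council of Medicine of Pernambuco", date(2019, 11, 6)),
    ("O Tempo", date(2019, 11, 7)),
]


def replication_lags(publications):
    """Days between the earliest publication and each later replication."""
    ordered = sorted(publications, key=lambda item: item[1])
    _, first_date = ordered[0]
    return {outlet: (day - first_date).days for outlet, day in ordered[1:]}


lags = replication_lags(publications)
print(lags)
```

The same calculation, applied to every cluster, gives the time intervals drawn as lines between dots in Figure 3.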
Another significant piece of data we spotted in the clusters characterized by the intermedia agenda-setting process shows that all the 11 clustered processes occurred in 2018 (cluster 14: national media) and 2019 (24, 32, 43, 50: regional media; 16, 48, 53, 54, 56, 59: national media), years during which the SNP developed its communication campaigns.
A relevant fact can be observed in cluster 16, wherein the agenda-setting process stems from a public organization towards a national media and, subsequently, is transformed into an intermedia agenda-setting effect. Regarding cluster 16, the news report from RADIS magazine was reproduced twice by FIOCRUZ, its organization, through its online media sites (i.e., Portal Fiocruz and Portal ENSP), and it was featured on BBC News 10 months later. Such a publication led two reporters to investigate the subject, find new sources, interview them, and produce original material using the same structural approach and angle. In journalistic jargon, this phenomenon is called a “hook”, the anchor of a news report. Shortly after that, besides publishing it on its website, BBC News also sold it to UOL and Globo group, which reproduced it on their portals, quoting the initial source in the credits.
Still, regarding the spontaneous news generated by the media, we identified the agenda-setting effect (i) in a news space of a private organization (cluster 28); (ii) from national media on a public organization (cluster 52), and (iii) from regional media on a public organization (cluster 46). Conversely, we perceived an inter-organizational agenda-setting effect between two public organizations that wielded an influence on national media (cluster 0).
In the case of cluster 0, the original news report explaining “what is syphilis”, published on 9 May 2016 on the Health Secretariat of São Paulo website, was replicated eight times. The first was 16 months later, on 21 October 2017, when the Health Secretariat of Bahia and the Health Surveillance Department of Santa Catarina updated the news report. The year following, on 24 February 2019, this department updated the publication content. Then, on 29 April 2019, it was the turn of the news website of the Federal University of Paraíba (UFPB) to add information about SIP and CS. About four months later, on 16 August 2019, the MoH website recapped the same information in a new section, “Health from A to Z”, with a focus close to that of the UFPB website, highlighting the care of SIP and CS infections. Two months later, during the period dedicated to the SNP campaign’s actions, the text was updated three times, on 10, 21, and 27 October 2019, on the news website of the MoH Virtual Library, on “Catraca Livre”, and on the MoH’s official website. This news report provides all the epidemiological features of syphilis. It is therefore a news report with organic characteristics, generated by the governmental agenda and disseminated through the news spaces of health-focused governmental bodies. Interestingly, it was also published on an alternative news website, namely Catraca Livre, characterizing the publication as spontaneous media. Spontaneous media, i.e., the broadcasting of news about syphilis without the onus or initiative of those responsible for the public communication campaign, indicates that the topic has achieved relevance in the media agenda and influenced the public’s agenda.
Moreover, news stories that resulted from the agenda-setting influence of a public organization on national media (clusters 2 and 45) can be considered spontaneous media, as can be the case when agenda-setting effects by public organizations were echoed in regional media (clusters 20, 21, 31, 36, 40, 44, 49). The inter-organizational agenda-setting effect between organizations (health, societal, or educational institutions) that disseminate the syphilis topic on their portals and websites was present as spontaneous news in five clusters (30, 33, 34, 37, 51). The recycling of spontaneous news was mapped out in eight clusters (1, 5, 4, 6, 7, 10, 11, 19).
In an organic format, thirteen clusters were characterized considering the news stories that exerted the inter-organizational agenda-setting effect among organizations directly related to the MoH, State, and Municipal Health Secretariats (clusters 8, 9, 12, 27, 35, 38, 39, 41, 42, 47, 55, 57, 58). Another three clusters resulted from recycled organic news (clusters 3, 13, and 22). Additionally, the agenda-setting arising from the republication of the news posted on the LAIS/UFRN website, paid for by the 2019 communication campaign of the SNP, rendered a recycled news item aired in the Metrópoles newspaper (Cluster 17).
Of note, the Google search engine returned, on different dates, identical URLs that were grouped into five clusters (15, 18, 23, 26, 29). To some extent, this indicates that the news repeatedly appeared at different moments, thus contributing to sustaining the topic’s prominence. Further, cluster 25 presented two web pages containing a table with the historical series of SIP and CS in the city of São Paulo, pages that experts incorrectly categorized. On the other hand, given the number of news reports evaluated (from 1049 to 619), this discrepancy does not significantly affect the results.
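The cluster assignments enumerated in this section can be cross-checked for completeness. The sketch below transcribes the cluster numbers from the text, taking the public-organization-to-regional-media list as clusters 20, 21, 31, 36, 40, 44, and 49, and assumes that the 60 clusters are numbered 0 to 59. The category names are hypothetical labels introduced for this check.

```python
# Cluster ids per category, transcribed from the text (hypothetical key names).
categories = {
    "intermedia_national": [14, 16, 48, 53, 54, 56, 59],
    "intermedia_regional": [24, 32, 43, 50],
    "national_media_to_private_org": [28],
    "national_media_to_public_org": [52],
    "regional_media_to_public_org": [46],
    "inter_org_to_national_media": [0],
    "public_org_to_national_media": [2, 45],
    "public_org_to_regional_media": [20, 21, 31, 36, 40, 44, 49],
    "inter_org_spontaneous": [30, 33, 34, 37, 51],
    "inter_org_organic": [8, 9, 12, 27, 35, 38, 39, 41, 42, 47, 55, 57, 58],
    "recycled_spontaneous": [1, 4, 5, 6, 7, 10, 11, 19],
    "recycled_organic": [3, 13, 22],
    "recycled_paid": [17],
    "identical_urls": [15, 18, 23, 26, 29],
    "miscategorized_pages": [25],
}

all_ids = [cid for ids in categories.values() for cid in ids]
# Ids listed under more than one category, and ids (assumed 0..59) never listed.
duplicates = sorted({cid for cid in all_ids if all_ids.count(cid) > 1})
missing = sorted(set(range(60)) - set(all_ids))

# The first nine categories are the spontaneous agenda-setting clusters.
spontaneous_agenda = sum(len(ids) for ids in list(categories.values())[:9])
print(len(all_ids), duplicates, missing, spontaneous_agenda)
```

Under these assumptions, every cluster falls into exactly one category, and the spontaneous agenda-setting categories add up to the 29 clusters stated earlier.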
The effects of agenda-setting on the media, for instance, regarding issues like syphilis and the possible transfer of salience to the public’s agenda, are not immediate and take place in the medium term [12]. Thus, there is a “latency” period between the stimulus (appearance of a topic on the media agenda) and the repercussion/response on the public’s agenda (relevance transfer process) [13]. This response or repercussion on the public’s agenda can be measured, for example, through comments on social media, the number of searches for specific keywords (e.g., syphilis) on digital search engines, and academic research initiatives that may have been motivated by reading published news about the disease.
Based on the “intermedia agenda-setting” theory [13], this study predicts and provides an approximate temporal range related to the communication strategy of an organized and structured public health campaign. It provides a mapping that can be used as a model and transfer element for planning future communication processes in public health crises. Simultaneously, it reveals the importance and power of journalistic works, which provide a source and can be replicated by other media outlets.
Therefore, the present work poses new research questions from a computational and communication perspective. Questions such as “what are the narrative elements themselves?” and “what formats can enhance and replicate communicative events in a way that can produce solid clusters on the public media agenda?” are highly relevant questions for future health campaign blueprints.
Pinto et al. [2] present a multidimensional exploratory analysis of the syphilis epidemic in Brazil. These authors analyze the growing interest of the Brazilian population in the subject on search engines and increased media coverage for the respective period. They also observed the positive impact of the SNP, reporting the rise in population syphilis test uptake and the drop in cases, reversing the growth trend of syphilis after nearly a decade.
Moreover, in [26], the same group of researchers performed a systematic literature review on approaches for evaluating public health campaigns. They concluded that such campaigns should be assessed from multiple perspectives to grasp the overall reach of mass communication. In addition, they observed a shortcoming in the impact evaluation of public health campaigns in terms of online data usage (online news and other user-generated online content). Further, these scholars highlighted some overlooked topics in the existing literature, e.g., the analysis of publications and research projects on the subject.
Hence, the results of the syphilis online news clusters for three periods of the SNP campaign suggest it encouraged the transfer of salience of the syphilis issue from the media to the public’s agenda, with an increase in the replication of news items. In conclusion, the sum of the during-campaign and post-campaign periods reflects the success of the agenda-setting process as a time of sustained media coverage, with an increase in replications, in addition to the consolidation of the intermedia agenda-setting, with an 87.3% correspondence of replications over time. It is noteworthy that, given the epidemic scenario of syphilis in Brazil [27], developing a sustainable mass communication agenda in health and demonstrating evidence of its sustainability is relevant for conducting public health policy, especially in crisis scenarios.