Diverging from News Media: An Exploratory Study on the Changing Dynamics between Media and Public Attention on Cancer in China from 2011–2020

Over the past decade, China has witnessed fast-paced technological advancements in the media industry, as well as major shifts in the health agenda portrayed in the media. A key starting point for discussing health communication is therefore whether media attention and public attention towards health issues are structurally aligned, and to what extent the news media guides public attention. Based on 73,060 sets of Baidu Search Index and Media Index data on 20 terms covering different types of cancer from 2011 to 2020, Granger tests demonstrate that, over the last decade, public attention and media attention towards cancer in China have gone through two distinct phases. During the first phase, 2011–2015, Chinese news media still held the key to transferring the salience of most cancer types to the public. In the second phase, 2016–2020, public attention towards cancer gradually diverged from media coverage, mirroring the imbalance and mismatch between the demand of an active public and the supply of cancer information from news media. This study provides an overview of the dynamic transition on cancer issues in China over a ten-year span, along with descriptive results on public and media attention towards specific cancer types.


Introduction
Cancer ranks universally as a leading cause of death and is a crucial barrier to increasing life expectancy [1]. In 2020, the estimated number of new cancer cases was 19.29 million globally, and approximately one out of every four new cancer patients was in China [2]. This life-threatening and increasingly prevalent disease has been widely studied at the intersection of cancer research and health-information seeking behavior (HISB) research [3][4][5][6].
HISB is widely defined as all the ways in which individuals go about obtaining health information [7], from which some sub-concepts have been derived, e.g., internet health-information seeking behavior (IHISB) and online health-information seeking behavior (OHISB). Generally, the two main data sources used in HISB studies are self-reported questionnaires or interviews, and data mined from search engines and other websites. Web data mining is now increasingly adopted over self-report methods because of its high efficiency, comprehensiveness, and objectivity, which address the major drawbacks of the latter [8].
It should be noted that there has been debate over what constitutes a sign of public attention. Data from surveys with traditional deployment methods were once an accepted means of assessing public attention and opinion; however, over the years it has become evident that such data fail to capture the elusive dynamics of public attention [9]. On the other hand, Eysenbach posits that log data from search engines "allow valuable insights into information needs and human behavior" and "can be meaningful inferences made on the presumed intention of the user" [10]. Experts in other fields have also supported this idea with empirical studies [11,12]. As Web-service-generated data represent an increasingly non-negligible instantiation of public attention, more and more scholars in health, policy and politics, media and communication, and other disciplines across the social and natural sciences are using online search data to quantify public attention [13][14][15].
In recent years, the cancer-related agenda has changed within Chinese media. According to the reversed agenda-setting hypothesis [16,17], the public is no longer a passive party but is in fact made up of so-called active users, independent of media agenda-setting [9]. It is therefore worth asking whether the news media can still lead public attention as it did in the era of mass communication [18,19]. Public attention on cancer not only promotes health consciousness and reduces people's risk of disease, but has also been shown to be a significant predictor of social support [20,21], with great relevance to personal health and social harmony in the long run. Thus far, however, there has been little research on this topic in China, the country with the world's largest cancer population. Baidu, owing to its position as the largest Chinese search engine globally, offers an effective platform for exploring cancer-information seeking behavior (CISB) in China.
Based on Baidu Search Index and Media Index data for 20 cancer types from 2011 to 2020, the study describes the distribution of public and media attention across cancer categories and clarifies the lead-lag patterns, namely the causal linkage, between media reports and public seeking behavior for cancer information in China. The remainder of this article is structured as follows: Section 2 explains the data source, data acquisition method, and overarching research design; Section 3 presents statistical outcomes, including descriptive statistics and Granger causality results; and Sections 4 and 5 present the discussion and the conclusions, respectively, which are central to the research.

Materials and Methods
This study employed a set of quantitative methods (Figure 1). Descriptive statistics and correlation tests were performed to determine how Chinese media and Chinese netizens have allocated attention to specific cancers over the last decade, as reflected by search engine data and China's latest cancer registry data. The Granger causality test was then adopted for causal inference between the time series [22]. To avoid spurious regression, the augmented Dickey-Fuller (ADF) test was applied first, and nonstationary variables were differenced to ensure stationarity [23]. Furthermore, a vector autoregression (VAR) model was used to determine the lag specification according to the information criterion [24]. EViews 11.0 was used for the Granger causality test, the ADF test, and VAR modeling, and SPSS 25.0 was used to calculate the correlation coefficients and the coefficient of variation (CV).
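To make the logic of the differencing and Granger steps concrete, the following is a minimal sketch in plain Python. It is not the authors' EViews procedure: the series are synthetic, the function names (`difference`, `ols_ssr`, `granger_f`) are hypothetical, the lag length of 2 is fixed by assumption rather than chosen by an information criterion, and first differencing is applied unconditionally instead of following an ADF test. The sketch only computes the F statistic that compares an autoregression of one series with and without lags of the other.

```python
import random

def difference(series):
    """First difference, the usual remedy for a non-stationary series."""
    return [b - a for a, b in zip(series, series[1:])]

def ols_ssr(rows, targets):
    """Residual sum of squares of an OLS fit, solved via the normal equations."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(k):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    beta = [a[i][k] / a[i][i] for i in range(k)]
    return sum((t - sum(b * x for b, x in zip(beta, r))) ** 2
               for r, t in zip(rows, targets))

def granger_f(cause, effect, lags):
    """F statistic for H0: lags of `cause` do not help predict `effect`."""
    restricted, unrestricted, y = [], [], []
    for t in range(lags, len(effect)):
        own = [effect[t - i] for i in range(1, lags + 1)]
        other = [cause[t - i] for i in range(1, lags + 1)]
        restricted.append([1.0] + own)            # own lags only
        unrestricted.append([1.0] + own + other)  # own lags plus lags of `cause`
        y.append(effect[t])
    ssr_r = ols_ssr(restricted, y)
    ssr_u = ols_ssr(unrestricted, y)
    dof = len(y) - (2 * lags + 1)
    return ((ssr_r - ssr_u) / lags) / (ssr_u / dof)

# Synthetic example (assumption): the media series leads the search series by one day.
random.seed(0)
n = 300
media = [0.0]
for _ in range(n - 1):
    media.append(media[-1] + random.gauss(0, 1))
search = [0.0, 0.0]
for t in range(2, n):
    search.append(search[-1] + 0.8 * (media[t - 1] - media[t - 2]) + random.gauss(0, 0.5))

d_media, d_search = difference(media), difference(search)
f_media_to_search = granger_f(d_media, d_search, lags=2)
f_search_to_media = granger_f(d_search, d_media, lags=2)
print(f_media_to_search, f_search_to_media)
```

In practice the F statistic would be compared against the F distribution to obtain a p-value, and the test would be run in both directions for each cancer term and year, which is how one-way and two-way linkages are distinguished.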
The data used in the analysis were the Baidu Search Index and the Baidu Media Index from Baidu, a Chinese tech company analogous to Google. From 2011 to 2020, Baidu accounted for the bulk of the domestic search engine market share and has been empirically shown to be the most commonly used tool for seeking cancer-related information among Chinese netizens [25][26][27]. Furthermore, the search engine indexes a wide range of electronic news content from both state-owned and pro-business Chinese media, providing a window into the dynamic flow of media attention towards cancer. Specifically, the Baidu Media Index is based on the quantity of keyword-related news reports from major internet media that are concurrently collected by the Baidu News Channel, whereas the Baidu Search Index takes keywords as statistical objects and calculates the public search volume for a given keyword.
This study crawled the Baidu Search Index and Media Index of the 20 cancer terms listed above from 2011 to 2020 using Python code. This process confirmed that all the cancers listed had corresponding index data in Baidu. In total, 73,060 sets of index data were collected, and the date of data collection was 5 January 2021.
It is important to note that we took two steps to address the potential effects of fraudulent traffic generated by web bots and of the filter bubble. Firstly, we collected and checked Baidu's statements on bot traffic and its Robots Exclusion Protocol, confirming that Baidu has systemic anti-crawler strategies and, in particular, that it has formulated strict rules against data fraud in order to maintain the fairness and impartiality of all index data. Secondly, to examine the existence of cancer-specific filter bubbles, we designed a pre-test involving Baidu users from several places in China. This additional investigation revealed that, regardless of whether participants had previous search experience and pre-existing digital traces, there was no obvious difference in either their search results or their news feeds for keywords on the specified cancer types.
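The reported total can be checked with simple arithmetic. The short sketch below assumes that one "set" means one daily observation (Search Index plus Media Index) per cancer term, an interpretation that is consistent with the total but not stated explicitly in the text.

```python
from datetime import date

terms = 20  # cancer terms in the study
# Inclusive number of days from 1 January 2011 to 31 December 2020
days = (date(2020, 12, 31) - date(2011, 1, 1)).days + 1
total = terms * days
print(days, total)
```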

Results
Figure 2 maps the yearly distribution of public and media attention towards the 20 cancer types. As shown, the patterns of the two lines are completely different. The Baidu Search Index of the 20 kinds of cancer maintained a long-term rising trend but declined in 2020, the year of the COVID-19 pandemic, to a level similar to that of 2015, while the Baidu Media Index has been hovering at a low level since a precipitous drop in 2015.
Figure 2. Yearly distribution of the two indices over ten years.
As Table 1 shows, the average Baidu Search Index of leukemia was the highest among the 20 cancers (M = 4734.769), whereas the lowest was that of head and neck cancer (M = 31.224). According to the latest data, leukemia was not even among the 10 most common cancers in terms of crude incidence and mortality rates [29,30], yet it was the cancer of greatest concern to Chinese netizens. Over the past decade, the public's searching activities have been relatively steady, given that the CV for CISB is less than 1 for all cancers except head and neck cancer and breast cancer. Breast cancer, with the highest maximum value (313,389), has undergone the most drastic change in public attention during the last decade. The Baidu Search Index for breast cancer reached its peak on 16 January 2015, the day that Bella Yao, a famous Chinese female singer, died of breast cancer. The incident itself, along with a succession of debates on media ethics, has gone down in the history of Chinese health communication [31][32][33].
According to the Spearman correlation test, the value of Spearman's rho between the Baidu Media Index and the number of cancer cases (ρ = 0.556, p = 0.013 < 0.05) is higher than that between the Baidu Media Index and the number of cancer deaths (ρ = 0.472, p = 0.041 < 0.05).
Looking briefly at the means of the Baidu Media Index (Table 2), we can conclude that media attention on different cancer types was thoroughly out of balance between 2011 and 2020. Leukemia attracted extremely disproportionate media attention (M = 48.771), while head and neck cancer was the least reported subject (M = 0.001), which corresponds to the pattern in the Baidu Search Index. As for whether it was the more fatal or the more prevalent cancers that drew more public attention, the findings were similar to those for the Baidu Media Index: the relationship between the Baidu Search Index and the number of cancer cases is much stronger (ρ = 0.472, p = 0.041 < 0.05), whereas there is no statistically significant correlation between the Baidu Search Index and the number of cancer deaths (p = 0.198 > 0.05).
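For readers unfamiliar with the statistic, Spearman's rho is the Pearson correlation computed on ranks, with tied values sharing their average rank. The sketch below illustrates the calculation in plain Python; the study itself used SPSS, the function names are hypothetical, and the two short lists are invented illustration values, not the study's data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented illustration values (not from the study)
index_means = [10, 20, 30, 40, 50]
case_counts = [1, 3, 2, 5, 4]
rho = spearman_rho(index_means, case_counts)
print(rho)
```

Because the statistic depends only on rank order, it is well suited to comparing index means with case or death counts that differ by orders of magnitude across cancer types.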
The Granger causality tests (Table 3) indicate causation between the Baidu Search Index and Media Index of the 20 cancers. Furthermore, causal relationships for certain cancer types, both one-way and two-way, were found in every year of the decade. Of the 20 cancers searched, stomach cancer was the only type for which the causal linkage between the two indices was never broken. The Granger test showed that the Baidu Media Index of stomach cancer Granger-caused its Baidu Search Index for the most part, with the exceptions of 2013, 2017, and 2019 (mutual causation) and 2020 (one-way Granger-causality from the Baidu Search Index to the Baidu Media Index). The second most consistent link exists between the indices of leukemia, which was broken only in 2020.
Figure 3 shows two phases, divided by the year 2015: the first supported a media-generated agenda, whereas the second exhibited divergent attention. From 2012 to 2015 in particular, the media dominated agenda-setting in China and deeply influenced the public agenda on cancer topics, while public attention exerted a sustained but very weak effect on what was discussed in media newsrooms. As cancer news moved into the second half of the decade, and especially in 2020, the agenda-setting ability of the media appears to have gradually declined. The causal linkage (one-way Granger-causality from the Baidu Media Index to the Baidu Search Index) was no longer clearly evident, and mutual independence, or divergence, between the Baidu Search Index and Baidu Media Index emerged.
To address the shortcomings of the Granger causality test, a supplemental CV calculation was performed to locate strong variation in public attention and then to determine, on the basis of variability in the Baidu Media Index, whether cancer-related media events aroused public interest. The yearly CVs in Figure 4 show that, compared with the dramatic shifts in the Baidu Media Index, public attention to cancer did not fluctuate much over the past decade. High variation (CV > 1) in the Baidu Search Index occurred six times: pancreatic cancer in 2011, skin cancer in 2012, non-Hodgkin lymphoma, bladder cancer, and breast cancer in 2015, and head and neck cancer in 2020. Four of these fluctuations were related to the cancer diagnosis and death of celebrities (see Table 4); in three of them, the Baidu Media Index peaked a few days before or in sync with the peak in the Baidu Search Index, anticipating the steep increase in searching behavior.
The same is true of cancer-related film screenings. However, such cases have become rare since 2015, which coincides with the preceding result that 2015 marks the transition in the attention patterns of the public and the news media.
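The yearly CV screening described above can be sketched as follows. This is an illustration only: the daily values are invented, the names are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption about how the CV was computed.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """CV = sample standard deviation divided by the mean."""
    return stdev(values) / mean(values)

# Invented daily index values for two years; 2015 contains a single sharp spike,
# as might follow a widely reported celebrity death.
yearly_search_index = {
    2014: [100, 110, 90, 105, 95],
    2015: [100, 100, 100, 100, 3000],
}

cv_by_year = {year: coefficient_of_variation(values)
              for year, values in yearly_search_index.items()}

# Years flagged as high variation under the CV > 1 threshold used in the text
high_variation_years = [year for year, cv in cv_by_year.items() if cv > 1]
print(cv_by_year, high_variation_years)
```

A single extreme day is enough to push the yearly CV above 1, which is why the flagged years are then checked against dated media events.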

Discussion
The news industry was previously regarded as the first port of call for valuable information and stories. Nowadays, with the aid of increasingly advanced technology, we are entering a new era in which the entry barrier to news reporting has been lowered and anyone can take an active part in generating and spreading media content.
The field of health communication is no exception: revolutionary strides in communication have profoundly influenced public health perception and behavior. News media, once thought of as the most trusted source of health information for the public, is now confronted with numerous new challenges regarding access to health information. Hence, it is a pressing question for both government departments and health organizations whether news media can still play the key role in reaching out to the public and leading its awareness of the health agenda, because if the answer is no, transformation and innovation in health advocacy and promotion should be formulated and adopted accordingly.
Our research begins with how the Chinese media and public have allocated their attention to different types of cancer over the last ten years, and one of our primary findings is a descriptive overview of the current status. In the distribution of attention, both the media and the public tended to be concerned with cancers of higher incidence. The media's preference is understandable as an effective media strategy, which should, after all, reach more people and strive for impact on a larger scale. However, the correlation coefficient of the Baidu Search Index did not reflect a relationship between CISB and the cancer-related death toll. This seemingly runs counter to common sense and to our prior assumption, as the fear of cancer, which has a strong possibility of leading to CISB [34], emanates from a core fear of mortality of such a sickness unto death [35]. With regard to the different cancer types, leukemia, with unremarkable morbidity and mortality, has been the top focus of the Chinese public and media over the previous decade, while the least attention was directed to head and neck cancer, a general term for a variety of cancers.
What sets media attention apart is its much higher variation: the CV of the Baidu Media Index, both over the ten-year span and within individual years, mostly exceeds 1. Though the disparity in coverage may be reasonable in a practical sense, inasmuch as the time, resources, and manpower of news organizations are limited, such practice may have negative effects on health behavior and decision-making, considering that millions of people are estimated to suffer from less covered cancers such as oesophageal and thyroid cancer, whose incidence rates ranked sixth and eighth in China, respectively [29,30].
Although previous studies revealed a marginally significant positive relationship between public attentiveness and journalistic pieces about cancer [18], few studies have concentrated on the causal linkage between the two. Using Granger causality tests, the search engine data allow us to address this question.
Our Granger test findings broadly echo the viewpoint of Russell et al. that there exists "an interaction and differentiated resonance" [9], in this case between public and media attention on cancer in the first five years of the last decade. The indices also indicate a widening gap between the attention of the public and that of the media, specifically from the year 2015, which divides cancer-related communication into two phases. In the first phase, 2011 to 2015, the media's strong agenda-setting function was highlighted by the fact that the Baidu Media Index of at least half of the 20 cancer types Granger-caused the Baidu Search Index, consistent with traditional agenda-setting theory, which posits that the truncated versions of the world presented by the news media are a primary source of people's perceptions of public affairs [36]. In the second phase, however, the media's agenda-setting capabilities did not function nearly as well, as the media agenda was of little relevance to the concerns of the public and neither addressed nor catered to the demand for cancer information among the population over the last five years. This was especially the case in 2020, when the widening of the gap accelerated and the causal linkages between public and media attention were substantially fractured, as cancer information-seeking was quite likely heightened under the conditions of an epidemic [37,38].
With recent technological advances and the emergence of new media empowering users, it is no surprise that the news media's agenda-setting ability has weakened and that media coverage is no longer a leading indicator of public awareness regarding topics like cancer. But there was an unexpected yet thought-provoking result: reversed agenda-setting has not become mainstream and remains unconventional, in line with Maxwell McCombs's inference [39].
However, we cannot totally negate the agenda-setting function of news media on cancer, as the yearly CV calculation reflected the ability of the media to set the agenda on certain types of cancer-related issues. For example, the apparent fluctuations in information-seeking behavior were regularly caused by celebrity news and media events, i.e., (1) the Chinese Hong Kong actor Nicholas Tse's diagnosis of skin cancer in 2012, (2) the Chinese singer Bella Yao's death from breast cancer in 2015, (3) the screening of the film Go Away Mr. Tumour, whose storyline concerns non-Hodgkin lymphoma, in 2015, and (4) the Chinese politician Xu's death from bladder cancer in 2015, all of which agree with previous findings that celebrity health disclosures and events can encourage HISB [40,41]. Across the variety of news media consumed, the public showed more concern for cancer coverage in entertainment and social news than for cancer coverage in health and science reports.
Our study had some limitations. Firstly, though it revealed a divergence between public attention and the news media agenda, there was no evidence to determine whether public attention flowed to social media or elsewhere, which requires further discussion and research. Secondly, the cancer registry data of the China National Cancer Center were the most up-to-date cross-sectional data, but not dynamic time series data covering the past decade. Thirdly, although we did not observe in the supplementary investigation that Baidu's cancer information feeds vary from person to person, Baidu did develop a personalized recommendation function, which may trap seekers of cancer information in filter bubbles created by the search engine. The potential effects of filter bubbles can also strengthen or weaken seekers' activities for specific cancer information, which leaves our conclusions open to question and calls for future studies. Finally, Granger causality tests reflect causation in a statistical rather than a philosophical sense, and additional research is needed to explore the causal linkage.

Conclusions
There was a distinct divergence between public attention and media attention toward the cancer-related agenda in China, reflecting a mismatch between public demand for cancer-related information and news media content generation, an imperative issue to tackle for effective public health communication. The causes of the divergence remain to be further discussed. Generally, we are inclined to regard it as a manifestation of new media empowering users: users generate, disseminate, and consume health information without relying on centralized news agencies. As a result, inconsistencies between public attention and media attention have spontaneously pervaded the information ecology. Yet we also cannot, solely through this study, single out the role of the elusive filter bubble of platforms and the collapsing public trust in news media, both of which, after all, can entail the fixation and shift of the public eye. We thus invite more in-depth thinking on what this divergence can be attributed to.
What is certain, however, is that a divergence between the public and the media has existed over the past decade. There are substantial grounds to believe that COVID-19 is disrupting, and will continue to disrupt, cancer information acquisition, as urgent access to medical recommendations has switched to telehealth [42]. Considering the public's active demand for various types of cancer information and the Chinese news media's less prominent position in setting the cancer agenda, media with a personalized agenda-setting function, for instance, we-media and algorithmic media, should be given greater importance in public health communication for cancer, as a halting post-pandemic transition has already become a foregone conclusion.
Data Availability Statement: Data sharing is applicable to this article on request from the corresponding author.