Human Papillomavirus Infection Is Associated with Decreased Risk of Hepatocellular Carcinoma in Chronic Hepatitis C Patients: Taiwan Nationwide Matched Cohort Study

Simple Summary Previous studies have suggested a link between HCV and HPV-associated head and neck cancers. Epidemiological evidence on the related association between HPV and HCV-associated HCC is scarce. In the current study, based on a secondary claims dataset, HPV infection was not associated with an increased risk of HCC in the CHC population. On the contrary, HPV infection appeared to be associated with a lower risk of HCC development among patients with HCV infection. The mechanism linking HPV infection to a reduced risk of HCC in the CHC population should be studied in detail in the future, as it may yield an intervention target that could delay the development of HCC.

Abstract Background: Hepatitis C virus (HCV) has been shown to be associated with human papillomavirus (HPV)-positive head and neck cancers. However, studies regarding HPV infection and the risk of new-onset hepatocellular carcinoma (HCC) among chronic hepatitis C (CHC) patients are limited. We examined the risk of HCC in CHC patients with or without HPV infection. Methods: In total, 9905 CHC patients from 2000 to 2016 constituted the whole cohort. HPV infection was counted only when diagnosed after HCV. The CHC cohort with HPV (N = 1981) and age-, sex-, inception point-, comorbidity-, and medication-matched non-HPV controls (N = 7924) were followed up until HCC, death, or the end of 2018. HCC patients were identified from the Taiwan Registry for Catastrophic Illness Database. We used propensity score matching and inverse probability of treatment weighting (IPTW) to reduce bias. Cox proportional hazard regression analyses were performed to calculate HCC risk. Results: After full adjustment, HPV was associated with a lower HCC risk (aHR, 0.74; 95% CI, 0.58–0.96 in the main model, and aHR, 0.76; 95% CI, 0.66–0.87 in IPTW). Almost all subgroup analyses supported this finding (HRs < 1.0).
Conclusions: Among CHC patients aged 20 years or older, those with HPV infection had a lower risk of subsequent HCC.


Introduction
Hepatocellular carcinoma (HCC) is one of the most common cancers of the digestive system, with approximately 500,000 new cases diagnosed worldwide each year [1], and it also accounts for the second highest number of cancer deaths worldwide [2]. Therefore, the discovery of effective methods to prevent HCC is a critical public health issue. Hepatitis C virus (HCV) is a major contributor to the rising incidence of HCC worldwide and accounts for the second leading cause of HCC-related deaths [3]. Although progress in antiviral therapy has led to the dawn of HCV eradication and a reduced risk of HCC, the risk has not been completely eliminated [4]. The cirrhotic population appears to have an increased risk for HCC even after HCV eradication [5].
In ecosystems, microorganisms engage in rich, heterogeneous, and simultaneous interactions. Epidemiological and experimental analyses have provided quantitative evidence for subtle mutually beneficial or competitive interactions between respiratory viruses, especially cold and influenza viruses [6]. Further examples include HPV infection, which seems to be facilitated by human immunodeficiency virus (HIV) [7], whereas Staphylococcus aureus is negatively associated with Streptococcus pneumoniae [8]. Pathogen-pathogen co-occurrence can lead to complicated competitive or co-operative forms of interaction.
An association between HPV and HCV in carcinogenesis has been reported [9,10]. HCV not only causes HCC but has also been proposed to be linked to HPV-associated oropharyngeal head and neck malignancies [11]. HCV nonstructural protein 5B (NS5B) recruits the HPV oncoprotein E6, leading to the proteasomal degradation of retinoblastoma tumor suppressor protein (Rb) [12,13]. Conversely, might HPV interact with the development of HCV-associated HCC? Both basic and epidemiological studies on this topic are scarce. We believe that it would be fruitful to examine this virus-virus interplay with respect to the hard outcome of HCC. Hence, the main objective of this retrospective secondary cohort study was to observe whether HPV infection has a synergistic or inhibitory effect on hepatocarcinogenesis in the CHC population.

Data Sources
After obtaining institutional review board (IRB) approval from the Taiwan Ministry of Health and Welfare, we identified the medical records of all patients aged 20 years and older with hepatitis C infection in the National Health Insurance Research Database (NHIRD). Taiwan's National Health Insurance (NHI) program, administered by the Taiwan government, is single-payer and mandatory, and covers >99% of Taiwan's population [14]. The NHIRD is available for research purposes upon appropriate application [15]. The validity of the NHIRD for use in epidemiological research has been shown in previous publications [10,16–18]. This study was a retrospective evaluation of patient information with no more than minimal risk to the subjects. It is impossible to identify individual patients, so informed consent was not required for this study. The encryption procedure was identical across files, so claims belonging to the same patient could be linked consistently for continuous follow-up. The NHIRD records comprised comprehensive, ongoing registration and claims information, including participants' characteristics, disease diagnoses, outpatient visits, emergency department utilization, inpatient information, diagnostic, treatment, and operation codes, and prescribed medications. All claims could be linked in chronological order to provide a temporal sequence of all health service utilization. Prior to 2016, the diagnosis codes used in this study were based on the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), and from 2016 onwards, the diagnosis codes were based on ICD-10-CM. This study was ap-

Identification of Study Sample (CHC Population)
The study database contained two million people, sampled by the Bureau of National Health Insurance (BNHI) from the original claims data of the NHIRD. The Health and Welfare Data Science Center (HWDC) of the Ministry of Health and Welfare provided data on the medical records and causes of death of the 2 million sampled people. The frequently used variables in the NHI data and the cause of death data were provided directly for each application. The distributions of sex, age, and health care cost were similar between the 2 million sampled files and the population of 23 million. The National Health Research Institutes (NHRI) assigned a random number to each person using the methods of Knuth and of Park and Miller [19,20].
There was no significant difference in the sex (p = 0.613) and age distributions between the subset data and the original NHIRD [21]. We identified adult patients (>20 years) who had been newly diagnosed and medically recorded with HCV infection, based on the ICD-9-CM codes 070.41, 070.44, 070.51, 070.54, 070.70, 070.71, or V02.62, from 2000 to 2016 [16]. The diagnosis required at least 3 outpatient visits or 1 hospitalization and was verified through medical record review by a board-certified gastrointestinal specialist. A total of 10,800 CHC patients were eligible for analysis. Under this HCV definition, although the exact proportion of patients with positive HCV RNA was unknown, the positive and negative predictive values could reach as high as 87% and 99%, respectively, as verified by a previous study linking the NHIRD to the New Taipei City Health Screening Database [22].

Exposure to HPV
Patients infected with HPV were extracted from the database according to ICD-9-CM code 079.4. To avoid selection bias, the inclusion criteria required at least three outpatient visits or one inpatient diagnosis in the same database. This definition was used in our previous epidemiological studies [10]. The onset of HPV infection was defined as a diagnosis of HPV in the years 2000-2016, but not in the years 1997-1999. The inception point was the date of the first diagnosis of HPV within the study time frame for the study group, and the same date was assigned to the matched subjects in the control group. The control group comprised CHC patients without a medical record of HPV before the inception point. To minimize misclassification, we also examined the control group to confirm that no HPV infection was diagnosed throughout the study period.
In these patients, testing for HPV infection was ordered at the clinical physician's discretion rather than through screening. The exact false-negative rate of HPV infection was unknown. However, false negatives are believed to be rare because HPV testing is closely tied to NHI reimbursement for subsequent treatment and follow-up.
To increase homogeneity, we excluded patients who had a medical record of HBV infection (ICD-9 codes 070.22, 070.23, 070.32, 070.33, and V02.61) or of other viral hepatitis (ICD-9 codes 573.1, 573.2, and B190). Patients with malignancies (ICD-9-CM codes 140-208) recorded prior to the index date were excluded to avoid potential liver metastases among the enrolled participants. Those who developed HCC within 6 months after the inception day were excluded to avoid cases of under-diagnosed HCC. We also excluded patients with missing information regarding age or sex.
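The HPV case definition described above (at least three outpatient visits or one inpatient diagnosis carrying ICD-9-CM code 079.4) can be illustrated with a minimal Python sketch. The `Claim` record, its field names, and the `hpv_cases` helper are hypothetical and do not reflect the actual NHIRD file layout or the authors' SAS code.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Set

HPV_CODE = "079.4"  # ICD-9-CM code named in the text


class Claim(NamedTuple):
    # Hypothetical, simplified claim record (not the NHIRD schema).
    patient_id: str
    setting: str  # "outpatient" or "inpatient"
    icd9: str


def hpv_cases(claims: Iterable[Claim], min_outpatient: int = 3) -> Set[str]:
    """Return IDs with >= min_outpatient outpatient claims or >= 1 inpatient claim."""
    outpatient_counts = Counter()
    cases = set()
    for claim in claims:
        if claim.icd9 != HPV_CODE:
            continue  # ignore claims for other diagnoses
        if claim.setting == "inpatient":
            cases.add(claim.patient_id)  # one inpatient diagnosis suffices
        elif claim.setting == "outpatient":
            outpatient_counts[claim.patient_id] += 1
    cases.update(pid for pid, n in outpatient_counts.items() if n >= min_outpatient)
    return cases


# Toy data: A has three outpatient visits, B only one, C one admission,
# D an admission for a different diagnosis.
claims = [
    Claim("A", "outpatient", "079.4"),
    Claim("A", "outpatient", "079.4"),
    Claim("A", "outpatient", "079.4"),
    Claim("B", "outpatient", "079.4"),
    Claim("C", "inpatient", "079.4"),
    Claim("D", "inpatient", "070.54"),
]
print(sorted(hpv_cases(claims)))  # ['A', 'C']
```

Requiring repeated outpatient codes trades some sensitivity for specificity, which is the usual rationale for this kind of claims-based definition.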

Potential Confounders
Patients' age and sex were recorded at the inception point. We also collected information about individual monthly income (<15,000 TWD or 540 USD, 15,000-29,999 TWD or 540-1080 USD, ≥30,000 TWD or 1080 USD), and urbanization level of residence area (1 to 4, 1 as the most urbanized and 4 as the least urbanized), a proxy for healthcare availability in Taiwan [23]. Baseline comorbidity was extracted from at least three consistent records

Ascertainment of HCC Diagnoses and Follow-Up
We set new-onset HCC as the primary outcome of the study (ICD-9-CM code 155.0; ICD-10-CM codes C22.0, C22.7, C22.8). HCC diagnoses were verified by linkage to the Registry for Catastrophic Illness Patient Database (RCIPD). The positive predictive value was 93% [24]. The study group and age-, sex-, comorbidity-, medication-, and starting point-matched controls (1:4 ratio) were followed until the occurrence of HCC, emigration, imprisonment, withdrawal from NHI, or the end of the study (31 December 2018), whichever came first.
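The "whichever came first" follow-up rule can be sketched as follows. This is only an illustration under stated assumptions: the function name `follow_up`, the generic `censor_dates` argument (standing in for emigration, imprisonment, or withdrawal from NHI), and the 365.25-day year are choices made here, not details taken from the study.

```python
from datetime import date
from typing import Optional, Tuple

STUDY_END = date(2018, 12, 31)  # end of follow-up stated in the text


def follow_up(inception: date,
              hcc: Optional[date] = None,
              censor_dates: Tuple[Optional[date], ...] = ()) -> Tuple[float, bool]:
    """Return (follow-up in years, True if HCC was the first event)."""
    candidates = [STUDY_END]
    candidates += [d for d in censor_dates if d is not None]
    if hcc is not None:
        candidates.append(hcc)
    end = min(candidates)  # whichever came first
    # HCC counts as an event only if it occurred no later than every censoring date.
    event = hcc is not None and hcc <= end
    years = (end - inception).days / 365.25  # assumed year length
    return years, event


# A patient with HCC five years after the inception point.
print(follow_up(date(2010, 1, 1), hcc=date(2015, 1, 1)))
# A patient who emigrated before an HCC diagnosis is censored at emigration.
print(follow_up(date(2010, 1, 1), hcc=date(2018, 1, 1),
                censor_dates=(date(2017, 6, 30),)))
```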

Matching Method
We performed the propensity score analysis in SAS, following the procedure described in [25]. That reference may help readers understand the detailed statistical procedures more clearly than the SAS log files of the analyses would.
The propensity score was the probability of disease status assignment, estimated with a logistic regression model that included the following baseline variables: age, gender, urbanization, and individual monthly income (TWD); comorbidities of hypertension, diabetes, hyperlipidemia, chronic kidney disease, peptic ulcer disease, Helicobacter pylori infection, COPD, cirrhosis, liver decompensation, alcohol-related illness, and autoimmune disease; each of the recorded medications (aspirin, metformin, statin, NSAIDs, and interferon); and index year.
CHC patients with HPV were matched (1:4 ratio) with those who did not have HPV according to their propensity score through nearest neighbor matching, initially to the eighth digit and then as required to the first digit. Therefore, matches were first determined within a caliper width of 0.0000001, and then the caliper width was increased for unmatched cases to 0.1. We reconsidered the matching criteria and performed a rematch (greedy algorithm). For each CHC patient with HPV, the corresponding comparisons were selected based on the nearest propensity score.
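The greedy nearest-neighbor matching with a progressively widened caliper can be sketched in Python as below. This is a simplified illustration, not the SAS procedure the authors used: the function name `greedy_match`, the intermediate caliper steps between 0.0000001 and 0.1, and the toy propensity scores are all assumptions made for the example.

```python
from typing import Dict, List, Sequence


def greedy_match(cases: Dict[str, float],
                 controls: Dict[str, float],
                 ratio: int = 4,
                 calipers: Sequence[float] = (1e-7, 1e-6, 1e-5, 1e-4,
                                              1e-3, 1e-2, 1e-1),
                 ) -> Dict[str, List[str]]:
    """Greedy nearest-neighbor matching without replacement.

    Calipers are tried from narrowest to widest; only the two endpoints
    come from the text, the steps in between are assumed.
    """
    matched: Dict[str, List[str]] = {cid: [] for cid in cases}
    available = dict(controls)  # controls not yet used
    for caliper in calipers:
        for cid, ps in cases.items():
            while len(matched[cid]) < ratio and available:
                best = min(available, key=lambda k: abs(available[k] - ps))
                if abs(available[best] - ps) > caliper:
                    break  # nothing close enough at this caliper
                matched[cid].append(best)
                del available[best]  # each control is used at most once
    return matched


# Toy propensity scores (hypothetical), matched 1:2 for brevity.
cases = {"hpv1": 0.30, "hpv2": 0.62}
controls = {"c1": 0.30, "c2": 0.32, "c3": 0.27, "c4": 0.35,
            "c5": 0.62, "c6": 0.60, "c7": 0.95}
print(greedy_match(cases, controls, ratio=2))
# {'hpv1': ['c1', 'c2'], 'hpv2': ['c5', 'c6']}
```

Exact matches are claimed first at the narrowest caliper, so widening the caliper later only fills the remaining slots. A greedy pass like this is order-dependent and is not guaranteed to minimize the total matching distance.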

Statistical Analyses
Baseline characteristics of participants were shown as event number and percentages for categorical variables and as means and standard deviations for continuous variables. Chi-square test was adopted to examine the categorical variables, and Student's t tests for continuous variables. All of the tests of significance were 2-tailed. p-value < 0.05 was considered statistically significant. The incidence of HCC during follow-up was calculated by dividing the number of events by the respective person years at risk and presented as the number of events per 1000 person years. Incidence rate ratio was assessed by Poisson regression. We plotted Kaplan-Meier curve to describe the cumulative incidence of HCC in the study and control group, and tested the difference between the groups by the log-rank test. We used multivariable Cox proportional hazards analysis to examine the HCC risk associated with HPV after adjustment for age, sex, hypertension, diabetes, hyperlipidemia, chronic kidney disease, peptic ulcer disease, Helicobacter pylori infection [26], COPD, cirrhosis, liver decompensation status, alcohol-related illness, CHC treatment, and medications. To validate the robustness of study findings (main model), four sensitivity analyses (model 2 to 5) were conducted.
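As a rough illustration of the incidence calculation described above, the following Python sketch computes a rate per 1000 person years and a crude incidence rate ratio with a Wald interval on the log scale. For two groups without covariates, this interval corresponds to a Poisson regression with a single binary covariate. The function names and the event and person-year counts are hypothetical, not the study's data.

```python
import math
from typing import Tuple


def incidence_rate(events: int, person_years: float, per: float = 1000.0) -> float:
    """Events divided by person years at risk, scaled per `per` person years."""
    return events / person_years * per


def rate_ratio(e1: int, py1: float, e0: int, py0: float,
               z: float = 1.96) -> Tuple[float, float, float]:
    """Crude incidence rate ratio (group 1 vs. group 0) with a Wald 95% CI."""
    irr = (e1 / py1) / (e0 / py0)
    se_log = math.sqrt(1 / e1 + 1 / e0)  # standard error of log(IRR)
    return irr, irr * math.exp(-z * se_log), irr * math.exp(z * se_log)


# Hypothetical counts: 50 events in 10,000 person years vs. 80 in 10,000.
print(incidence_rate(50, 10_000))        # 5.0 per 1000 person years
irr, low, high = rate_ratio(50, 10_000, 80, 10_000)
print(round(irr, 3))                     # 0.625
```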

Sensitivity Analyses
To address potential over-fitting, we conducted a sensitivity analysis (model 2) by matching for age, sex, index date, low individual monthly income, living area, comorbidity, and medications. Given the insidious nature of cancer, we conducted a further sensitivity analysis (model 3) to minimize the influence of indolent HCC around the inception point: the first 12 months of observation after the diagnosis of HPV were excluded, eliminating all cases of HCC that occurred within the first 12 months of follow-up. We conducted additional sensitivity analyses (models 4 and 5) using IPTW.
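The IPTW step can be illustrated with a short Python sketch. The text does not state which form of weight was used in models 4 and 5, so the unstabilized weights shown here (1/ps for exposed, 1/(1 - ps) for unexposed), the optional stabilized variant, and the function name `iptw_weights` are assumptions for illustration only.

```python
from typing import List, Sequence


def iptw_weights(exposed: Sequence[int], ps: Sequence[float],
                 stabilized: bool = False) -> List[float]:
    """Inverse probability of treatment weights from propensity scores.

    exposed: 1 for HPV, 0 for non-HPV; ps: estimated P(exposed = 1 | covariates).
    """
    p_exposed = sum(exposed) / len(exposed)  # marginal exposure prevalence
    weights = []
    for e, p in zip(exposed, ps):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        w = 1.0 / p if e else 1.0 / (1.0 - p)
        if stabilized:
            # Multiply by the marginal probability of the observed exposure.
            w *= p_exposed if e else (1.0 - p_exposed)
        weights.append(w)
    return weights


# Two hypothetical patients who share a propensity score of 0.25.
print(iptw_weights([1, 0], [0.25, 0.25]))
print(iptw_weights([1, 0], [0.25, 0.25], stabilized=True))
```

The weights would then be passed to a weighted Cox model. Stabilized weights keep the weighted sample size close to the original one, which tends to give narrower and more stable intervals.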

Subgroup Analysis
We conducted subgroup analyses to examine the potential interaction of gender, age, alcohol-related illness, cirrhosis, and liver decompensation status among CHC patients. We determined significance of interaction by the likelihood ratio test.
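A likelihood ratio test for interaction compares the log-likelihood of a model with the exposure-by-subgroup product term against that of a model without it. The sketch below assumes a single interaction term (one degree of freedom) and hypothetical log-likelihood values. The function name `lr_test_df1` is illustrative, and fitting the Cox models themselves is outside the scope of the example.

```python
import math
from typing import Tuple


def lr_test_df1(loglik_reduced: float, loglik_full: float) -> Tuple[float, float]:
    """Likelihood ratio statistic and p-value for one extra parameter.

    For a chi-square variable with 1 degree of freedom,
    P(X > x) = erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical partial log-likelihoods without and with the interaction term.
stat, p = lr_test_df1(-1000.00, -998.08)
print(round(stat, 2), round(p, 3))  # 3.84 0.05
```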

Patient Involvement and Data Availability Statement
The data source used in this study was the NHIRD claims data published by Taiwan National Health Insurance. Participants were not involved in this retrospective secondary cohort study. Owing to legal restrictions in Taiwan under the "Personal Information Protection Act", the data in this study cannot be made publicly available. Requests for data can be sent as a formal proposal to the NHIRD.

Demographic Characteristics and Comorbidities
In total, 9905 adult CHC patients were enrolled between 2000 and 2016. Among them, we identified the HPV group (N = 1981) and non-HPV control group (N = 7924). We selected controls based on strict criteria of the same age (without age range), same sex, same inception point (HPV diagnosis date as the inception point for case and matched non-HPV), same comorbidities, and medications distribution. Table 1 shows the characteristics of patients in both groups. In the HPV group, 50.8% of the participants were male, and 66.1% of them were between the ages of 20 and 64 years. The mean (SD) age was 57.4 (14.2) years. The most common comorbidities in the study group were peptic ulcer disease, hypertension, and hyperlipidemia. No difference in the use of aspirin, metformin, statin, NSAIDs, and interferon was shown at baseline between groups.

Primary Analysis
In the main model, CHC patients with HPV infection had a lower risk of developing HCC than those without HPV infection, with 76 and 326 patients developing HCC, respectively. The study group had a lower incidence rate (IR) of HCC (IR, 5.69/1000 person years) than the controls (IR, 7.26/1000 person years). Figure 1 shows the Kaplan-Meier curves: the cumulative incidence of HCC was lower in the HPV group than in the controls (log-rank test, p = 0.02). In Table 2, after full adjustment for demographics, age, comorbidities, and medication, the aHR of HCC for the study group relative to the controls was 0.74 (95% confidence interval (CI), 0.58-0.96; p < 0.05). The study group's mean time from the inception point to the development of HCC was 6.75 years (standard deviation (SD) 4.15), while in the control group it was 5.88 years (SD 4.05).

Table 2 also shows the results of the sensitivity analysis (model two). A lower risk of HCC remained in the HPV group compared to the matched controls (aHR, 0.76; 95% CI, 0.59-0.98). Table 3 shows the results of the sensitivity analysis (model three) (i.e., excluding HCC that occurred within 12 months of the inception point), where the aHR was 0.74 (95% CI, 0.57-0.97). This model aimed to mitigate the effect of underdiagnosed HCC cases around the inception point.

Table 3. Overall incidence of HCC (per 1000 person years) and estimated hazard ratios according to disease status and matching status by Cox method (excluding HCC occurrence within 1 year after the index date).

Table 4 shows the results of the sensitivity analysis (models four and five), where the aHR was 0.74 (95% CI, 0.65-0.86) and 0.76 (95% CI, 0.66-0.87), respectively.

Table 5 and Figure 2 illustrate the risks of HCC stratified by gender, age, and relevant comorbidities in CHC patients with HPV infection compared to controls. No significant interaction was found in any subgroup.
CHC patients with HPV infection did not have an increased risk of HCC in either gender or in any age group. After adjusting for sex, age, comorbidities, and medications, HPV was associated with a reduced risk of HCC in CHC patients with underlying cirrhosis (aHR, 0.41; 95% CI, 0.18-0.90), and in CHC patients with underlying liver decompensation (aHR, 0.26; 95% CI, 0.08-0.85). Table 6 shows this association stratified by follow-up time. A significantly lower risk of HCC was found in the study group during 1-3 years and 3-5 years after the inception point (aHR, 0.61; 95% CI, 0.37-0.99; aHR, 0.55; 95% CI, 0.31-0.97, respectively).

Discussion
The current study, using countrywide secondary claims data, found that a prior medically recorded HPV diagnosis was associated with a reduced risk of HCC in CHC participants. We extracted 10,800 CHC patients from an NHIRD sub-database of 2 million participants. In Taiwan, the prevalence of HCV infection among adults aged over 20 years old was about 4.4% [27]. The consistent findings of risk reduction in the subgroup analyses suggest a potential causal relationship. The risk was lower for both sexes and all age groups, and these differences almost achieved statistical significance, although the number of HCC events was small. HPV infection was an independent factor with a protective association with HCC development after adjustment for baseline demographics, comorbidities, and medications.
Furthermore, in the CHC population with a baseline status of cirrhosis and liver decompensation, the HPV associated risk reduction in HCC remained. On the contrary, in the subgroup of alcohol-related disease, the negative association between HPV with HCC was obscured. The results of our epidemiological study may have potential implications for the prevention of HCC in the CHC population.
The administration of interferon with ribavirin was the treatment of choice for HCV infection over the past 20 years, which was replaced by direct-acting antiviral (DAA) agents in the last decade. To date, several studies proved the usefulness of HCC prevention by interferon-based therapy in patients with HCV [28,29]. On enrolment, we matched the use of interferon in both groups, which could minimize the influence of HCC, lowering risks through an interferon-induced sustained viral response (SVR). Moreover, DAA agents also had some positive results on HCC prevention [30,31]. However, DAA agents were not covered by the NHI in Taiwan until 2017. Even though some HCV patients may take DAAs on their own financial support and are not registered in the database, by the estimation of the Taiwan Hepatitis C Policy Guideline, this number is small and, thus, the influence could be ignored.
The underlying mechanism to explain this inverse association remains unclear. Some studies have detected HPV 16 DNA and HPV 18-related nucleotide sequences in HCC specimens [32] and have shown that HPV 18 E6 and E7 genes can be integrated into the human hepatoma-derived cell line Hep G2 [33]. One study even provided evidence of a significant association between HCV and HPV in candidates for liver transplantation [34]. A recent study [9] of the protein interaction network between HPV and humans revealed that different human proteins have different numbers of interactions with HPV viral proteins. In that study, KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis of human proteins showed that the cell cycle gene set was most actively involved in interactions with HPV viral proteins, followed by viral carcinogenesis and p53 signaling. HCV was also found to be involved in HPV interactions, but the intensity of its involvement ranked only seventh.
The interaction between HPV and HCV in oncogenesis has been reported previously: in a 2016 case-control study of head and neck cancers, HCV was found to be associated with HPV-positive head and neck cancers, and a possible synergistic oncogenic role of the HPV E6 protein in the development of head and neck cancers was postulated [11]. HCV and HPV seem to share oncogenic pathways in head and neck cancer; however, their interaction in HCC development remains unclear. The role of HPV in interfering with hepatocarcinogenesis may be related to the involvement of Toll-like receptors (TLRs). Single-nucleotide polymorphisms in TLR genes may be key markers of early susceptibility to various cancers, including HCC [35]. Since TLRs initiate intracellular signaling pathways that induce antiviral mediators, they have been considered the first line of antiviral immunity [36]. Studies have shown that TLR4 rs4986790 was significantly negatively associated with HCV infection [35]. Individuals carrying the rs1927911 heterozygous genotype had a significantly reduced risk of HCC [37]. On the other hand, HPV 16/18 infection was shown to be associated with TLR4 rs4986790 and rs1927911 [38], which may indicate a negative role for HPV in the development of HCV-related HCC.
Several rigorous statistical models were adopted in our study. Because two databases in Taiwan (NHIRD and RCIPD) were linked, the hard outcome of HCC was reliable. We found an inverse association between HPV infection and HCC. Whether there is a causal relationship between HPV infection and HCC development requires further study.

Limitations
First, although the number of patients enrolled in this study was adequate, the HCC event rates were low. This was more obvious in the subgroup analyses. However, this pilot study still offers a new perspective on viral interaction and on the association between viruses and tumor occurrence. A larger-scale cohort study could be conducted in the future.
Second, this was not a prospective study. Although we matched potential confounders between the two study groups, observation or selection bias remained possible. Nevertheless, the large sample and the well-validated coding of the exposure, outcome, and covariates helped to mitigate such bias and to elucidate a clinically meaningful association.
Third, some patients might have had a recurrent HPV infection during the study period. Our study mimicked an intention-to-treat analysis; therefore, patients with another HPV infection may have been included, and, therefore, the assessment of HPV and HCC risk should be conservative.
Fourth, although both interferon and DAA reduce the risk of HCC [39], DAA use was not recorded by the Taiwan Insurance Administration at the time of our case recruitment, and interferon use in both groups was matched at baseline, so the bias caused by different drug treatments for HCV was likely minimal. Moreover, HPV vaccination may alter the immune system of the host. However, because HPV vaccination was not covered by the health insurance plan during the study period, information on HPV vaccination was not available in our database.

Conclusions
In this large-scale, retrospective CHC cohort study in Taiwan, people with a new medical record of HPV infection had a lower risk of HCC compared to CHC patients without HPV infection.
Informed Consent Statement: This study was a retrospective evaluation of patient information with no more than minimal risk to the subjects. Since it is impossible to identify individual patients, informed consent was not required for this study.
Data Availability Statement: Data are available from the National Health Insurance Research Database (NHIRD) published by the Taiwan National Health Insurance (NHI) Bureau. Due to legal restrictions imposed by the government of Taiwan in relation to the "Personal Information Protection Act", data cannot be made publicly available. The Longitudinal Health Insurance Database 2000 (LHID2000) was used for this study. There were about 1 million individuals randomly sampled from the Beneficiaries of the National Health Insurance Research Database (NHIRD), that comprised approximately 23.75 million individuals in NHIRD. For details of LHID2000, please visit the website: https://nhird.nhri.org.tw/en/Data_Subsets.html.