Impact of country-specific EQ-5D-3L tariffs on the economic value of systemic therapies used in the treatment of metastatic pancreatic cancer

Background Previous Canadian cost-effectiveness analyses in cancer based on the EQ-5D-3L (EuroQoL, Rotterdam, Netherlands) have commonly used U.K. or U.S. tariffs because the Canadian equivalent only recently became available. The implications of using non-Canadian tariffs to inform decision-making are unclear. We aimed to reevaluate an earlier cost-effectiveness analysis of therapies for metastatic pancreatic cancer (originally performed using U.S. tariffs) with tariffs from Canada and various other countries to determine the impact of using non-country-specific tariffs.
Methods We used tariffs from Canada, the United States, the United Kingdom, Denmark, France, Germany, Japan, the Netherlands, and Spain to derive EQ-5D-3L utilities for the 10 health states in the pancreatic cancer model. Quality-adjusted life years (qalys) and incremental cost-effectiveness ratios (icers) were generated, and probabilistic sensitivity analyses (psas) were performed.
Results Canadian utilities are generally lower than the corresponding U.S. utilities and higher than those for the United Kingdom. Compared with the Canadian-valued scenarios, U.S. and U.K. estimates were statistically different for 3 and 9 scenarios respectively. Overall, 35% of the non-Canadian utilities (28 of 80) were significantly different, clinically, from the Canadian values. Canadian qalys were 6% lower than those for the United States and 6% higher than those for the United Kingdom. When comparing the qalys of each treatment with those of gemcitabine alone, the average percent change was +6.8% for a U.S. scenario and –7.5% for a U.K. scenario compared with a Canadian scenario. Consequently, Canadian icers were approximately 5.4% greater than those for the United States and 8.6% lower than those for the United Kingdom. Based on the psas and compared with the Canadian threshold value, the minimum willingness-to-pay threshold at which the combination chemotherapy regimen of gemcitabine–capecitabine is the most cost-effective is $5,239 lower in the United States and $11,986 higher in the United Kingdom.
Conclusions The use of non-country-specific tariffs leads to significant differences in the derived utilities, icers, and psa results. Past Canadian EQ-5D-3L–based cost-effectiveness analyses and related funding decisions might need to be revisited using Canadian tariffs.


INTRODUCTION
In light of increasing health care costs, administrators and governments often consider cost-effectiveness analyses to be essential tools for making health care decisions and allocating resources [1]. The analyses compare the costs and outcomes of current medical interventions with alternative strategies, which can include combination therapy, the use of novel therapeutic agents, or discontinuation of treatment [2]. The desired outcome or effectiveness of a treatment is considered to be an increase in the quality of life or the life expectancy of a patient (or both), commonly pooled as a single measurement of quality-adjusted life years (qalys). Calculating the cost difference between two interventions and dividing by the difference in qalys yields the incremental cost-effectiveness ratio (icer), the major outcome reported for such analyses [3]. Authorities such as the U.K. National Institute for Health and Clinical Excellence [4] and the Canadian Agency for Drugs and Technologies in Health [5] generally accept the use of qalys. Cost-utility assessments in the field of oncology also frequently use qalys [6].
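As a minimal sketch of the icer calculation just described, the following Python function divides the difference in cost by the difference in qalys. The function name and the input values are hypothetical and are not taken from the analysis reported here.

```python
def icer(cost_new, cost_ref, qaly_new, qaly_ref):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    delta_qaly = qaly_new - qaly_ref
    if delta_qaly == 0:
        # With no difference in effectiveness the ratio is undefined.
        raise ValueError("QALYs are identical; the ICER is undefined")
    return (cost_new - cost_ref) / delta_qaly


# Hypothetical inputs for illustration only (not study data).
example = icer(cost_new=30_000.0, cost_ref=20_000.0, qaly_new=0.75, qaly_ref=0.50)
print(example)  # 40000.0 dollars per QALY gained
```

Because the qaly difference sits in the denominator, a small change in utilities can shift the icer noticeably when the incremental qaly gain is itself small.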
Health utilities are often measured using instruments such as the Health Utilities Index (Health Utilities, Hamilton, ON) and the EQ-5D-3L (EuroQoL, Rotterdam, Netherlands) [7,8]. Those tools rank utility on a scale of 0 to 1 (1 representing the best possible health state and 0 representing death), with the possibility of states worse than death (that is, less than 0) [9].
The EQ-5D-3L in particular is regularly applied in health utility studies of cancer, and it is the endorsed instrument of the U.K. National Institute for Health and Clinical Excellence [10]. This multi-attribute utility instrument, in the form of a patient-reported questionnaire, represents a single health state by documenting 5 domains of health (mobility, self-care, usual activities, pain and discomfort, and anxiety and depression). Each domain is ranked on a scale of 1 to 3, representing none, some, or extreme problems in that area. Thus, 3^5 = 243 health states are possible.
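The count of possible health states can be checked by enumeration. The sketch below, in Python, builds every 5-digit profile (one digit per domain, each from 1 to 3); the variable names are illustrative only.

```python
from itertools import product

# The five EQ-5D-3L domains, each scored 1 (no problems) to 3 (extreme problems).
DIMENSIONS = (
    "mobility",
    "self-care",
    "usual activities",
    "pain and discomfort",
    "anxiety and depression",
)
LEVELS = (1, 2, 3)

# Each health state is written as a 5-digit profile such as "11232".
states = [
    "".join(str(level) for level in combo)
    for combo in product(LEVELS, repeat=len(DIMENSIONS))
]

print(len(states))            # 243
print(states[0], states[-1])  # 11111 33333
```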
To account for the loss of individual patient preferences, valuation studies have been used to create country-specific tariffs for the conversion of EQ-5D-3L scores into utilities [11]. Valuation studies are generally conducted as large population-based surveys that use time trade-off techniques to map a selected subset of EQ-5D-3L states to a scoring algorithm. The scoring algorithm establishes a statistical relationship between the utilities and the EQ-5D-3L scores to generate an index of country-specific tariffs [12], with the U.K. index being the most established. If a country-specific index is not available, the U.K. index is often used [13-15]. However, a number of studies have demonstrated that using a non-country-specific tariff is inappropriate because of variations in methodologies and cultures.
Concerns with methodologies include the fact that, in a few studies, survey populations differed, as did the variables included in the final modelling equation. For example, certain valuation studies surveyed the general public [16,17]; others surveyed patients [18] or caregivers [19] (or both). The variables included in modelling equations provide numeric values that account for the loss of utility that occurs when a dimension is not given a rating of 1. For example, the model might account for a mobility rating of 2 by subtracting 0.05 from the overall utility of the health state. A study that compared the modelling methodologies of several EQ-5D-3L valuation studies suggests that, although certain countries (Denmark, Germany, and the Netherlands, for example) generated preference scores that were similar to, but higher than, those generated for the United Kingdom, others (such as the Spanish model) showed significantly more variance. These country-specific differences in utilities were also observed by other authors [20,21].
Differences in age, sex, standard of living, and preferences by health state in the general population of various countries can potentially contribute to cultural differences in a model [22,23]. For example, a French valuation study found that French respondents weighted problems in the mobility, self-care, and usual activity parameters more heavily than did U.K. respondents [24]. A comparison of standardized EQ-5D-3L data across countries found that prior living standards (per-capita gross domestic product) correlated the most with the EQ visual analogue scale scores (Spearman rank correlation: 0.58). Linear regression analysis showed that gross domestic product explained 67% of the variance in the EQ visual analogue scale scores (p = 0.02) when outliers were excluded and 29% when the outliers were included; per-capita health expenditure explained 26% of the variance in the mean visual analogue scale score (p = 0.03). Finally, in a preliminary study of the relationships between national culture and EQ-5D value sets, power distance (that is, the degree to which people in a society expect and desire inequalities among themselves) and individualism were found to have moderate and strong correlations with pain and discomfort, and anxiety and depression [22].
Differences in methodology and culture can lead to differences in qaly calculations and cost-effectiveness analyses. Karlsson et al. [25] found that applying various national EQ-5D-3L tariffs to the same data can result in substantially different incremental qaly estimates. In their study, U.K. values were approximately 1.5 times those calculated using U.S. or Danish tariffs. One cost-effectiveness analysis conducted in the United States (n = 301) that compared treatments for Parkinson disease showed that the icer calculated using U.S.-specific tariffs was higher than the icer calculated using U.K. tariffs ($108,498/qaly vs. $42,989/qaly) [26].
Canada has a predominantly public health system, and so it is crucial that its limited resources are properly allocated. Hospital expenditures for cancer drugs are high; in 2009 alone, provincial cancer agencies were estimated to have spent $800 million [27]. Cost-effectiveness analyses can be used to influence funding decisions and to limit wasteful spending on cost-ineffective treatments.
Many Canadian cost-effectiveness analyses in cancer have been conducted using the utility scores derived from the U.K. tariffs [28-31] and, to a lesser extent, the U.S. tariffs [32] obtained from the EQ-5D-3L. The implications of this practice of using non-Canadian tariffs have not been determined, and the Canadian Agency for Drugs and Technologies in Health has not yet made an official recommendation on this topic. The previously mentioned studies conducted in the United States and Denmark suggest the possibility that differences in country-specific tariffs could lead to differences in the associated cost-effectiveness results, which might in turn alter reimbursement decisions. Notably, no published studies have delineated the effect that the use of Canada-specific tariffs (compared with tariffs from other countries, especially the United Kingdom and the United States) would have on cost-effectiveness analyses. In any given evaluation, adopting the appropriate tariffs is therefore crucial if the aim is to ensure that appropriate country-specific decisions are made.
A Canadian valuation of EQ-5D-3L tariffs was recently published [17], and its findings suggest that a review of cost-effectiveness analyses based in Canada might be necessary.
In the present study, we set out to determine whether the use of non-country-specific tariffs produces biased results in cost-effectiveness analyses. As an example, we used an earlier analysis that aimed to assess the cost-effectiveness of various therapies for metastatic pancreatic cancer from a Canadian perspective. That study was originally performed using U.S.-based preference tariffs; in the present analysis, it was updated to use alternative country-specific tariffs [32].

METHODS
We recently published a cost-effectiveness analysis for systemic therapies in pancreatic cancer from a Canadian public payer perspective [32]. In brief, a Markov analytic decision model was used to analyze a hypothetical cohort of patients with metastatic pancreatic cancer treated with one of four chemotherapy regimens. In the analysis, gemcitabine alone (gem) was compared with three combination therapies: gemcitabine-capecitabine (gem-cap); gemcitabine-erlotinib (gem-erl); and oxaliplatin, irinotecan, 5-fluorouracil, and folinic acid (folfirinox). Full treatment details, adverse events, and resource utilization can be found in the original publication.
The original study created 8 scenarios describing the symptoms of patients undergoing one of the four chemotherapy treatments. The symptoms included nausea and vomiting, diarrhea, hand-foot syndrome, stomatitis, febrile neutropenia, fatigue, rash, and neutropenia. One additional scenario was used to describe stable disease, and another, to designate a clinical scenario with only basic supportive care. For the original study, those scenarios were sent in the form of a survey to 60 medical oncologists in Canada who were requested to use the EQ-5D-3L to report their perception of the patient's health state. The oncologists chosen were all experts in noncolorectal gastrointestinal cancers and were familiar with the relevant treatments and side effects. Additional questions to ascertain respondent demographics were included in the survey, and the demographics of the oncologists who responded are presented in Table I here. More information on the methods is provided in the original publication [32].
The EQ-5D-3L survey responses were converted into utility scores using each of the country-specific tariffs. Tariffs from the following countries were used: the United Kingdom [15], the United States [16], Canada [17], Denmark [33], France [24], Germany [34], Japan [35], the Netherlands [36], and Spain [37]. Each country's original valuation study developed its own mathematical model for determining country-specific tariffs. Those tariffs were used to score the EQ-5D-3L results obtained in our survey, generating a set of country-specific utilities for each individual responder and for each scenario. The utilities were then averaged to obtain the mean country-specific utilities for each scenario. Table II lists the model types according to each country's valuation study. For each scenario, the utility scores thus generated were compared with the Canadian equivalents using paired two-tailed t-tests; differences were considered significant at p < 0.05. To account for multiple comparisons, the Bonferroni correction was used. Furthermore, to determine the effects of utility differences on clinical practice, one-sample one-sided t-tests were performed on each of the 10 scenarios using the differences in the country-specific utilities compared with the Canadian utilities. The null hypothesis was that the absolute difference in utilities was equal to or less than 0.06, the minimally important difference (mid) based on the literature [38].
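The corrected significance threshold follows from the number of comparisons: 8 non-Canadian tariffs across 10 scenarios gives 80 comparisons, so 0.05 / 80 = 0.000625, the value quoted in the table notes. The Python sketch below reproduces that arithmetic and computes a paired t statistic for one scenario. It is an illustration only: the function name and the utility values are hypothetical, and the conversion of the statistic to a p value (which needs the t distribution) is left out.

```python
from math import sqrt
from statistics import mean, stdev

N_COUNTRIES = 8    # non-Canadian tariffs, each compared with the Canadian tariff
N_SCENARIOS = 10   # health-state scenarios in the model
ALPHA = 0.05

# Bonferroni correction: divide the nominal alpha by the number of comparisons.
bonferroni_alpha = ALPHA / (N_COUNTRIES * N_SCENARIOS)
print(round(bonferroni_alpha, 6))  # 0.000625


def paired_t(other, canadian):
    """Paired t statistic for the same respondents scored with two tariffs."""
    diffs = [x - y for x, y in zip(other, canadian)]
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) / standard_error


# Hypothetical utilities for four respondents in one scenario (not study data).
other_tariff = [0.60, 0.55, 0.70, 0.65]
canadian_tariff = [0.50, 0.50, 0.60, 0.60]
t_stat = paired_t(other_tariff, canadian_tariff)
print(t_stat)
```

The statistic would then be compared against a t distribution with n - 1 degrees of freedom, using the corrected threshold rather than 0.05.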
Cost-effectiveness was determined as Canadian dollars per qaly for each of the country-specific utilities. The icers were generated using gem alone as the base case. Probabilistic sensitivity analysis was performed for 10,000 simulations to determine the cost-effectiveness and net monetary benefit of each regimen over a range of willingness-to-pay (wtp) thresholds. Net monetary benefit was expressed as e × wtp − c, where e is effectiveness, wtp is willingness to pay, and c is cost.
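To illustrate how net monetary benefit (e × wtp − c) ranks regimens at a given threshold, the following Python sketch compares two hypothetical regimens. The regimen labels, qalys, and costs are invented for the example and do not correspond to the model inputs; a probabilistic analysis would repeat this comparison over sampled inputs and count how often each regimen ranks first.

```python
def net_monetary_benefit(effect, wtp, cost):
    """Net monetary benefit: effectiveness valued at the WTP threshold, minus cost."""
    return effect * wtp - cost


def best_regimen(regimens, wtp):
    """Return the regimen with the highest net monetary benefit at this threshold."""
    return max(
        regimens,
        key=lambda name: net_monetary_benefit(regimens[name][0], wtp, regimens[name][1]),
    )


# Hypothetical (QALYs, cost in dollars) pairs, for illustration only.
regimens = {
    "A": (0.50, 20_000.0),
    "B": (0.60, 28_000.0),
}

print(best_regimen(regimens, 50_000))   # A
print(best_regimen(regimens, 100_000))  # B
```

In this example, regimen B costs $8,000 more for about 0.10 additional qalys (an icer of roughly $80,000 per qaly), so it overtakes regimen A only when the threshold exceeds that value.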

RESULTS
Figure 1 shows the utility values obtained from the survey when Canadian, U.S., and U.K. tariffs were used, with Canadian values being the common comparator. Correlations of the Canadian scores with those from the United States and United Kingdom were high: the R² using the Pearson product-moment correlation was 0.9361 and 0.9460 respectively. In general, U.S. utilities were greater than the corresponding Canadian ones; U.K. utilities were smaller. Table III summarizes the results of the EQ-5D-3L survey for all countries. Japan generally had the highest utility scores overall (7 of 10 scenarios); the Spanish and U.K. scores were generally the lowest (4 and 3 of 10 scenarios respectively). When comparing the utilities based on Canadian tariffs with the utilities based on tariffs from other countries for the 10 scenarios, statistical differences were observed for 10 utility values derived using Japan-based tariffs, 9 each using U.K.- and Netherlands-based tariffs, 8 using Spain-based tariffs, 7 using France-based tariffs, 6 using Denmark-based tariffs, 3 using U.S.-based tariffs, and 2 using Germany-based tariffs. In terms of the mid, none of the differences between the Canada- and U.S.-derived utility scores were significant. On the other hand, 60% of the differences between Canada- and U.K.-derived utility scores were significantly greater than the mid, suggesting that differences in the U.K. model might lead to clinically significant differences in utilities. After the U.S.-derived utilities, Germany-derived utilities are the most similar to Canadian utilities. Overall, 35% of the utility differences (28 of 80) were statistically significantly greater than the mid.
Table IV shows the results of the cost-effectiveness analysis in terms of qalys. No difference in the order of treatment efficacy was observed: folfirinox was consistently the most effective, followed by gem-erl, gem-cap, and gem alone. Variation between countries in regimen qalys was observed, ranging from a minimum difference of 0 to a maximum difference of 0.184. The U.S.-derived qalys were 6% greater on average than the Canadian ones; U.K.-derived qalys were 6% lower. Compared with gem alone, the changes in qalys for gem-cap, gem-erl, and folfirinox were 0.048, 0.076, and 0.214 when Canada-derived, and 0.052, 0.081, and 0.226 when U.S.-derived. Furthermore, when compared with Canadian-derived qalys, the average change over all three treatments was 6.8% higher for U.S.-derived qalys and 7.5% lower for U.K.-derived qalys. Interestingly, although only 1 German scenario resulted in a significant utility difference larger than the mid (Table III), the expected qalys for Germany were the most different when compared with the Canadian ones (>20% different on average). The qalys based on Danish tariffs were the most similar to the Canadian qalys.
Table V shows the results of the cost-effectiveness analysis in terms of icers, with results, as expected, following the same trend shown by the qaly values. Using Canadian tariffs and comparing combination treatment with gem alone, the icers were $84,475 for gem-cap, $155,459 for gem-erl, and $130,670 for folfirinox. The equivalent icers generated using U.S. tariffs were approximately 5.4% lower; the use of U.K. tariffs led to an 8.6% average increase in the icers. The largest difference between country-specific icers and the Canadian ones was 17% (German tariffs). The icers generated using tariffs from Denmark were the most similar to the Canadian icers (2% difference on average). Cost-effectiveness acceptability curves for U.K., U.S., and Canadian values (Figure 2) visually show the differences in treatment acceptability. Table VI shows the probability of a treatment regimen being cost-effective at specified wtp thresholds. At a low wtp ($50,000), gem had the highest probability of being cost-effective for all countries. At a wtp of $75,000, gem remained the most cost-effective treatment for most of the countries (ranging from 60.57% for the United States to 82.12% for the United Kingdom); however, at that threshold, gem-cap is more likely to be cost-effective when German or Japanese tariffs are used. When the wtp is $100,000, all country-specific tariffs result in treatment with gem-cap being the most likely to be cost-effective. However, probabilities range from 57.11% (United Kingdom) to 92.76% (Japan). Finally, at higher wtp thresholds (>$150,000), folfirinox is the treatment most likely to be cost-effective using any of the country indices. Figure 3 graphically summarizes the probabilities for U.K., Canadian, and U.S. valuations.
The wtp range at which gem-cap is the most cost-effective treatment also varied depending on country-specific tariffs (Table VI). Below the lower limits of the ranges, gem is the most cost-effective, but folfirinox is the most cost-effective above the upper limits. At no wtp threshold is gem-erl cost-effective, a result that is consistent for all country valuations. The largest difference in the threshold at which gem-cap becomes the treatment most likely to be cost-effective is found when comparing the German and U.K. indices, with the German threshold being $25,361 lower than that of the United Kingdom. For folfirinox, the largest difference is between Germany and France, with Germany's threshold being $32,185 less than that of France.

DISCUSSION AND CONCLUSIONS
The present study used Canadian-based tariffs derived from responses on the EQ-5D-3L tool to compare the results of a cost-effectiveness analysis of first-line systemic therapies in metastatic pancreatic cancer with results generated using tariffs from the United States, the United Kingdom, and various other countries. Comparisons of the country-specific indices showed significant differences between the valuations of the derived utility scores. Those differences potentially have interesting clinical implications. According to a study by Pickard et al. [38], the minimally important difference in EQ-5D-3L scores in cancer is estimated to be 0.06. Thus, we found that 35% of the differences in EQ-5D-3L scores from Canada and from other countries could be considered clinically different, providing evidence that using country-specific EQ-5D-3L tariffs is important in ensuring that clinically meaningful differences are captured. The differences we uncovered had an observable effect on the results of the Markov modelling simulation.
Boldface type indicates values that are significantly larger than a minimally important difference of 0.06 (p < 0.000625) [38].
Compared with our original analysis (which used U.S. tariffs), the analysis using Canadian tariffs resulted in qalys that were 6.8% lower for interventions compared with standard gem. Tariffs from the United Kingdom, which were often used as a surrogate index before Canadian tariffs were available, resulted in an expected change in qalys that was, on average, 7.5% less than those calculated using Canadian tariffs. Those qaly differences contributed to variations in the associated icers and probabilities of the highest net monetary benefit at specific wtp thresholds in the final analysis. According to a survey of U.S. and Canadian oncologists, a common wtp range is $50,000-$100,000 [39]. Between those thresholds, gem is most likely to provide the highest net monetary benefit; however, the probability of that outcome is 64% in Canada, 58% in the United States, and 75% in the United Kingdom. At those intermediate wtp thresholds, the probability of treatments being the most cost-effective varies the most between countries and could have an effect on funding decisions.
Our study illustrates the effect of using non-country-specific tariffs on the utilities and results in a cost-effectiveness analysis in a pancreatic cancer model. Differences in country-specific EQ-5D-3L tariffs might be a result of methodologic variations between modelling studies. In obtaining EQ-5D-3L results, some authors have surveyed different populations (for example, a general population vs. patients and caregivers), leading to potential alternative perceptions of health. Most valuation studies attempt to fit multiple models to their results so as to increase the R², leading to disparities in the number of variables included in the final models. For example, the N3 model used by most countries and the D1 model used in the United States could result in variations that are partly responsible for the observed differences in utilities. Sampling variability in each valuation study can also contribute to those differences.
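The structural difference between the model types can be made concrete by deriving their indicator terms from a health-state profile. The Python sketch below follows the definitions given in the table notes (α, D1, I2, I3, and N3); the function name is hypothetical, and the country-specific coefficients that multiply these terms in each valuation study are not reproduced here.

```python
def model_terms(state):
    """Derive valuation-model terms from a 5-digit EQ-5D-3L profile such as '11232'."""
    levels = [int(ch) for ch in state]
    if len(levels) != 5 or any(level not in (1, 2, 3) for level in levels):
        raise ValueError("state must be five digits, each 1, 2, or 3")
    n_level2 = levels.count(2)
    n_level3 = levels.count(3)
    return {
        # Intercept: any movement away from perfect health (11111).
        "alpha": int(n_level2 + n_level3 > 0),
        # Number of dimensions at level 2 or 3 beyond the first.
        "D1": max(n_level2 + n_level3 - 1, 0),
        # Number of dimensions at level 2 beyond the first.
        "I2": max(n_level2 - 1, 0),
        # Number of dimensions at level 3 beyond the first.
        "I3": max(n_level3 - 1, 0),
        # Level 3 occurs within at least one dimension.
        "N3": int(n_level3 > 0),
    }


print(model_terms("11232"))
# {'alpha': 1, 'D1': 2, 'I2': 1, 'I3': 0, 'N3': 1}
```

A model that includes only N3 treats every profile with at least one level 3 rating identically on that term, whereas D1, I2, and I3 scale with the number of affected dimensions, which is one way the same profile can receive different utilities under different tariffs.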
Regardless of the source of the differences in tariffs between countries, the conventional way of applying the tariffs is to use the point estimate of each health state to calculate utilities, which ignores the sampling variation (or uncertainty) associated with the point estimate. However, because only one valuation study is usually performed in each country, there is no way to adequately determine if the failure to account for variability is the cause of differences in the utilities. To counteract that difficulty, Xie et al. [12] created a checklist for reporting valuation studies (CREATE) that minimizes the heterogeneity caused by factors other than population health preferences. Items include the rationale for choosing the target population and the representativeness of the respondents with respect to the target population, among others. Future valuation studies should be standardized to implement CREATE, and more research into the causes of the differences in modelling valuations should be conducted.
One limitation of the present study is that, instead of patients, medical oncologists were surveyed to generate the EQ-5D-3L states for each hypothetical scenario. Some of the literature [40] suggests that it might be reasonable to use health care workers as surrogates for patients in obtaining health utilities, but this approach is controversial, and further research on the topic is needed. However, our original analysis used both one-way deterministic sensitivity analyses and probabilistic sensitivity analyses to examine the uncertainty with respect to utilities, finding that the results were robust to utility variability [32]. Furthermore, because the same values were used to calculate utilities according to each country's tariff (for example, 11232 was converted using each country's model), it is likely that any differences observed in the analysis were a result of significant differences in the valuation models rather than in the absolute 5-dimensional score. Another limitation of our study is that our survey generally described poor health states, which could amplify any differences between the models generated by the valuation studies. To validate our results, future studies should be expanded to analyze both poor and good health states.
We hope that the findings of our study will encourage other investigators and policymakers to revisit previous models and decisions that used non-country-specific tariffs. Canadian analyses that used U.K. tariffs were sensitive to utilities, and any studies that deemed an intervention to be cost-ineffective should be reassessed because of the lower utility scores that would result from the use of U.K. instead of Canadian tariffs. Countries lacking their own valuation studies to determine EQ-5D-3L tariffs should strongly consider developing their own valuation studies rather than using tariffs from other countries when performing their cost-effectiveness analyses to inform policymaking. Given that most of the obvious differences in utilities appeared to occur for poor health states with low utilities (or even negative utilities), models describing health conditions involving poor health states (such as metastatic cancer or chronic severe cardiovascular diseases, including severe heart failure and stroke) are potentially more susceptible to this issue. Additionally, because of the rapidly rising costs of oncology drugs and increased reliance by many jurisdictions internationally on economic evaluations in drug funding decisions, our study assists oncologists, who are becoming more interested in learning the details of economic evaluations, by providing more knowledge so that they can engage in the discussion of drug assessment and funding.

α = intercept [indicator for any movement away from perfect health (11111)]; D1 = number of dimensions at level 2 or level 3 beyond the first; I2 = number of dimensions at level 2 beyond the first; I3 = number of dimensions at level 3 beyond the first; N3 = level 3 occurs within at least one dimension; NA = not available.

FIGURE 1 Canadian compared with U.S. and U.K. mean utilities.

TABLE II
Background characteristics for time trade-off elicited country valuation studies

TABLE III
Mean utility scores obtained from the EQ-5D-3L survey
a Boldface type indicates values that, using the Bonferroni correction (p < 0.000625), are significantly different from Canadian values.
b Values used in the original study.
c

TABLE IV
Quality-adjusted life years (QALYs) for metastatic pancreatic cancer treatments based on country-specific EQ-5D-3L tariffs

TABLE V
Results of the cost-effectiveness analysis expressed using the incremental cost-effectiveness ratio (ICER)

TABLE VI
Percentage probability of achieving the highest net monetary benefit at specified willingness-to-pay (WTP) thresholds
Current Oncology, Vol. 22, No. 6, December 2015 © 2015 Multimed Inc.