Are Tumor Marker Tests Applied Appropriately in Clinical Practice? A Healthcare Claims Data Analysis

Tumor markers (TM) are crucial in the monitoring of cancer treatment. However, inappropriate requests for screening purposes carry a high risk of false positive and false negative findings, which can lead to patient anxiety and unnecessary follow-up examinations. We aimed to assess the appropriateness of TM testing in outpatient practice in Switzerland. We conducted a retrospective cohort study based on healthcare claims data. Patients who had received at least one of seven TM tests (CEA, CA19-9, CA125, CA15-3, CA72-4, Calcitonin, or NSE) between 2018 and 2021 were analyzed. An appropriate determination was defined as a request with a corresponding cancer-related diagnosis or intervention. Appropriateness of TM determination by patient characteristics and prescriber specialty was estimated using multivariate analyses. A total of 51,395 TM determinations in 36,537 patients were included. Overall, 41.6% of all TM were determined appropriately. General practitioners determined TM most often (44.3%) and had the lowest proportion of appropriate requests (27.8%). Requests by medical oncologists were a strong predictor of appropriate determinations. A remarkable proportion of TM testing was performed inappropriately, particularly in the primary care setting. Our results suggest that a considerable proportion of the population is at risk for various harms associated with misinterpretations of TM test results.


Introduction
Tumor markers (TM) have a crucial role in managing various types of cancer. They are used to estimate the prognosis, monitor the efficacy of treatment, and detect relapse early [1,2]. However, because of their limited sensitivity and specificity, most TM are not appropriate as screening parameters or for clarifying non-specific clinical findings [3–5]. Unjustified determinations may not only strain laboratory resources and create unnecessary costs for the healthcare system, but can also harm the patient by leading to unnecessary further investigations, interventions, and anxiety in case of false positive results. Major medical guidelines have recognized the risks associated with overtesting and oppose the use of TM as a diagnostic tool for cancer detection [6–23]. A few, mostly outdated, studies conducted in different healthcare settings indicated that TM are frequently misused as screening parameters [24–26]. The proportion of appropriate determinations varies considerably, with figures as low as five percent [25]. It is, however, unknown how TM are used in routine clinical practice in Switzerland. In addition, little is known about predictors of appropriate requests on both the physician and patient sides. Therefore, using a large Swiss claims database, we aimed, first, to determine the frequency of a predefined set of TM; second, to determine the proportion of TM appropriately applied, defined as a laboratory test with a corresponding cancer-related disease and intervention (CDI); and, third, to investigate the appropriateness of TM determination according to prescriber specialty and patient characteristics.

Study Design and Setting
We conducted a retrospective cohort study using healthcare claims data from the Helsana Health Insurance Group (Helsana). Helsana is one of the largest health insurance companies in Switzerland, covering an average of around 1,395,000 mandatorily insured patients from all parts of the country, which corresponds to around 15% of the Swiss population. Linked at the patient level and based on healthcare invoices for reimbursement, the database includes longitudinal information about patients' sociodemographics, medications prescribed, laboratory tests received, use of outpatient and inpatient healthcare, and associated costs. The content of the obligatory health insurance in Switzerland is regulated by law. Therefore, there are no regulatory differences between Helsana and other Swiss health insurance companies. Several previous studies in different contexts strongly suggest that the study population (Helsana insurance collective) is representative of the Swiss population and have confirmed the high quality and completeness of the insurer's database [27–29].
We studied persons aged 18 years or older with mandatory health insurance at Helsana between 2018 and 2021 who had received at least one TM test in those years. The first determination of any TM in this period was set as the patient's index date. Additional TM determinations after the index date or within the look-back period were not included in further analysis. To determine whether the TM determination was appropriate, the five years preceding and the 12 months following the index date were examined for a CDI (antineoplastics, diagnosis, radio-oncological therapy, in- or outpatient operations, histology, CT-thorax-abdomen for cancer staging, and two or more visits to an oncologist). Patients were excluded if they were not insured with Helsana during the entire look-back and follow-up period.
Patients were classified as "Cancer diagnosis (CD) reliable" if they had any of the following CDI: prescription of antineoplastic agents, a diagnosis related to the single TM, or a radio-oncological therapy. Approximation codes for "CD probable" were an in- or outpatient surgical procedure probably associated with cancer, e.g., a rectal resection for rectal cancer. Approximation codes for "CD conceivable" were codes for histology, radiological procedures associated with cancer staging (namely, contemporaneous CT of the thorax and abdomen), or two or more visits to an oncologist. Patients with none of the listed codes were classified as "No CD". The categories "CD reliable" and "CD probable" were considered an "appropriate request", and "CD conceivable" and "No CD" an "inappropriate request". Table S1 lists all CDI assigned to the related TM. Figure S1 shows the flow chart and classification of CDI.
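The classification rules above can be sketched in code. The following Python snippet is only an illustration under stated assumptions: the CDI labels, the `Claim` record, and the day-based window arithmetic are hypothetical stand-ins (the study's actual codes are listed in Table S1, and the original analysis was performed in R, not Python).

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical CDI labels; the study's real billing codes are in its Table S1.
RELIABLE = {"antineoplastic", "diagnosis", "radiotherapy"}
PROBABLE = {"inpatient_operation", "outpatient_operation"}
CONCEIVABLE_SINGLE = {"histology", "ct_thorax_abdomen"}


@dataclass(frozen=True)
class Claim:
    kind: str  # one of the labels above, or "oncologist_visit"
    day: date


def in_window(claim_day, index_day, lookback_years=5, followup_days=365):
    """True if the claim lies in the look-back or follow-up period.

    Assumption: years and months are approximated by fixed day counts.
    """
    start = index_day - timedelta(days=365 * lookback_years)
    end = index_day + timedelta(days=followup_days)
    return start <= claim_day <= end


def classify(claims, index_day):
    """Return the cancer diagnosis (CD) category for one patient."""
    kinds = [c.kind for c in claims if in_window(c.day, index_day)]
    if RELIABLE & set(kinds):
        return "CD reliable"
    if PROBABLE & set(kinds):
        return "CD probable"
    # Histology, staging CT, or at least two oncologist visits.
    if CONCEIVABLE_SINGLE & set(kinds) or kinds.count("oncologist_visit") >= 2:
        return "CD conceivable"
    return "No CD"


def is_appropriate(category):
    """Only "CD reliable" and "CD probable" count as appropriate requests."""
    return category in {"CD reliable", "CD probable"}


index_day = date(2020, 6, 1)
claims = [Claim("histology", date(2020, 7, 1)), Claim("oncologist_visit", date(2020, 8, 1))]
category = classify(claims, index_day)
print(category, is_appropriate(category))
```

The categories are checked in order of decreasing certainty, so a patient with both a histology code and an antineoplastic prescription is assigned to "CD reliable".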

Testing of TM
The TM tests were selected from the list of analyses (AL) from the Federal Office of Public Health [30]. The AL contains all laboratory tests that are covered by the mandatory health insurance in Switzerland. The markers were selected due to their broad use and their (in most cases) unique assignment to a cancer entity.

Statistical Analysis
Descriptive statistics were calculated for the patient characteristics, proportion and appropriateness of TM use, cost analysis, and physicians' specialization. The direct laboratory costs for the TM were taken from the health insurance bills of each patient. The costs are shown in Swiss francs as well as in euros and US dollars to enable international comparison. The currency conversion was based on the annual average rate given by the Swiss tax authorities [41].
Appropriateness of TM testing was examined with a binomial regression model with a logit link function. Statistical significance was set at the 0.05 level.
All analyses were performed using the statistical program R, version 4.2.2 (R Foundation for Statistical Computing, Vienna, Austria).
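For orientation, a binomial regression with a logit link has the general form sketched below. The notation is an assumption introduced here for illustration: p_i is the probability that the request for patient i is appropriate, and x_{ik} stands for covariates such as prescriber specialty, language region, type of deductible, and age; the exact covariate coding used in the study is not specified in this section.

```latex
\operatorname{logit}(p_i) = \ln\frac{p_i}{1 - p_i} = \beta_0 + \sum_{k} \beta_k x_{ik},
\qquad
\mathrm{OR}_k = e^{\beta_k},
\qquad
95\%\ \mathrm{CI}_k = \left[\, e^{\beta_k - 1.96\,\mathrm{SE}(\beta_k)},\; e^{\beta_k + 1.96\,\mathrm{SE}(\beta_k)} \,\right]
```

Under this form, an odds ratio above 1 for a covariate level indicates higher odds of an appropriate request relative to the reference level.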

Frequency of TM Determinations
In total, 205,160 determinations of TM were identified in 51,780 patients between 2018 and 2021 in the Helsana population. After excluding patients who were not continuously insured for the entire look-back and follow-up period and counting only the first TM determination per patient (index date), 51,395 determinations of TM in 36,537 patients remained.
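The restriction to the index date can be illustrated with a short sketch. The record layout (patient identifier, test date, marker) and the sample data are hypothetical, and the code is written in Python for illustration only; it is not the study's R code.

```python
from collections import defaultdict
from datetime import date


def index_determinations(records):
    """Keep only the determinations made on each patient's index date.

    records: iterable of (patient_id, test_date, marker) tuples (hypothetical layout).
    Returns {patient_id: (index_date, sorted list of markers on that date)}.
    """
    by_patient = defaultdict(list)
    for patient_id, test_date, marker in records:
        by_patient[patient_id].append((test_date, marker))

    result = {}
    for patient_id, tests in by_patient.items():
        index_date = min(test_date for test_date, _ in tests)
        markers = sorted({m for test_date, m in tests if test_date == index_date})
        result[patient_id] = (index_date, markers)
    return result


# Made-up example data, not taken from the study.
records = [
    ("p1", date(2019, 3, 1), "CEA"),
    ("p1", date(2019, 3, 1), "CA19-9"),
    ("p1", date(2020, 5, 9), "CEA"),  # later test, not counted
    ("p2", date(2021, 1, 15), "CA125"),
]
index = index_determinations(records)
n_determinations = sum(len(markers) for _, markers in index.values())
print(index["p1"], n_determinations)
```

In this toy example, four recorded tests reduce to three index-date determinations in two patients, which mirrors how the cohort's determinations per patient are counted.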

Number of TM Determinations per Patient at Index Date
Most patients had one TM determined at the index date (25,728, 70.4%), 7838 patients (21.5%) had two different TM, and 2971 (8.1%) had three or more TM determined. The mean number of TM determined in one patient at index date was 1.41 (Table S2).
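The reported shares and the mean can be reproduced from the counts given in the text. The short Python check below uses only figures stated in this article; the variable names are illustrative.

```python
# Counts as reported in the text.
patients_by_count = {"one": 25_728, "two": 7_838, "three_or_more": 2_971}
total_determinations = 51_395

total_patients = sum(patients_by_count.values())
shares = {k: round(100 * v / total_patients, 1) for k, v in patients_by_count.items()}
mean_per_patient = round(total_determinations / total_patients, 2)

print(total_patients)    # 36537
print(shares)            # {'one': 70.4, 'two': 21.5, 'three_or_more': 8.1}
print(mean_per_patient)  # 1.41
```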

Detected CDI and Appropriateness of TM Requests
For all TM combined, 28.5% (10,425) of patients received antineoplastic medication, most frequently those in whom the TM CA15-3 for breast cancer was determined (4301, 53.0%) and least frequently those with Calcitonin for medullary thyroid cancer (153, 10.8%). An average of 21.0% (7669) of the patients had a predefined ICD-10 diagnosis, again most frequently the patients with the TM CA15-3 (2881, 35.5%). An intervention billed with a predefined outpatient operation code was detected only rarely during the look-back or follow-up period of the respective TM (243 patients, 0.7%), whereas a predefined inpatient operation code was found in 20.8% (7594) of all patients. A code for histology was found in 71.0% (25,939) of all patients. No evidence of cancer was found in 21.0% (7682) of all patients (Table 2).

Cost Analysis
The total cost of the index-date TM determinations in all 36,537 patients was CHF 1,205,842 (EUR 1,096,220, USD 1,262,661). The proportion of costs for appropriately determined TM was 36.5%, corresponding to CHF 440,370 (EUR 400,336, USD 461,120).
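The cost share can be verified from the two CHF amounts reported above. The following Python check is a simple arithmetic illustration; the variable names are introduced here and are not part of the study.

```python
# Figures as reported in the Cost Analysis section (CHF).
total_cost_chf = 1_205_842
appropriate_cost_chf = 440_370

appropriate_share = 100 * appropriate_cost_chf / total_cost_chf
inappropriate_cost_chf = total_cost_chf - appropriate_cost_chf

print(f"Appropriate share of costs: {appropriate_share:.1f}%")
print(f"Costs of requests not classified as appropriate: CHF {inappropriate_cost_chf:,}")
```

By subtraction, roughly CHF 765,000 of direct laboratory costs relate to requests that were not classified as appropriate.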

TM Determination According to Physicians' Specialization
Among the specialists, general practitioners (GP) most frequently requested at least one of the seven TM (16,199, 44.3%), followed by the group "Others" (5843, 16.0%) and gynecologists (5048, 13.8%) (Table 4).
a Number, percent, percent appropriate request.
b Specialists ranked after the top 5 nominated ones were subsumed under "Others".
c Tertiary hospitals are central providers, in Switzerland defined by the treatment of more than 9000 inpatient cases per year and a sum of more than 20 training categories.
d Group practices are defined by their organization (e.g., shared premises) and not by the specialty of the physician involved. It is not possible to deduce the specialty from our data. The service providers were categorized using the classification of the Swiss paying agent register (created by SASIS AG).

Multivariate Analysis on Appropriateness
Table 5 shows the multivariate analysis of appropriateness for each TM. Strong predictors of appropriate determinations across all individual TM were requests by oncologists or tertiary hospitals. For example, compared with GP, oncologists determined CEA appropriately with an odds ratio of 11.2 (95% confidence interval (CI) 9.93, 12.6, p < 0.001). For most TM, language region (French and Italian) was associated with less appropriate determination, whereas type of deductible was not. Age was a slightly positive predictor of appropriate determination in six of the seven analyzed TM.
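To show how an odds ratio and its confidence interval relate to a logit coefficient, the sketch below back-calculates the coefficient and standard error from the CEA example above. It assumes a Wald-type 95% interval (z = 1.96), which the article does not state explicitly; the function names are hypothetical, and Python is used for illustration although the study's analyses were run in R.

```python
import math


def coef_from_or(odds_ratio, ci_low, ci_high, z=1.96):
    """Recover the logit coefficient and its standard error from a reported OR and CI.

    Assumes a Wald interval that is symmetric on the log-odds scale.
    """
    beta = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z)
    return beta, se


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logit coefficient and standard error to an OR with its CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


# Reported example: oncologists vs. GP for CEA, OR 11.2 (95% CI 9.93, 12.6).
beta, se = coef_from_or(11.2, 9.93, 12.6)
or_, low, high = odds_ratio_ci(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"OR = {or_:.1f}, 95% CI = ({low:.2f}, {high:.1f})")
```

The round trip reproduces the reported interval only approximately, because the published values are rounded.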

Discussion
To the best of our knowledge, the present study is the first to evaluate appropriateness of TM determination in a large cohort based on healthcare claims data, and additionally to analyze patient and prescriber characteristics, as well as the first to evaluate TM determinations in Switzerland.
The study reveals the following key results: First, only 41.6% of all TM determinations were classified as appropriate requests. The mean number of different TM requested at a time was 1.41. These findings suggest that a substantial proportion of determinations are made as part of screening tests or to clarify non-specific clinical findings.
Although comparability poses difficulties, e.g., due to different markers examined in different study populations as well as various underlying guidelines, the results are still in line with previous, mostly older, studies. For example, Ntaios et al. analyzed TM requests retrospectively in their hospital and found a total of 9782 inappropriate TM orders in a ten-month period during 2008; for the TM CA125, AFP, CA19-9, CYFRA21-1, and NSE, the adequate requests were under 10% [42]. Moreno et al. [43] analyzed laboratory requests from the University Hospital of Padua. In the two-year study period between 2011 and 2013, 23,059 analytical requests of TM were analyzed, and 39.9% were classified as appropriate. The mean number of TM requested was 2.4, and 26.6% of requests ordered four or more TM at a time. Arioli et al. [25] found that only five percent of TM requests were appropriate in their department of Internal Medicine in the Hospital in Modena. A more recent study from 2020 revealed similar results: In a teaching hospital, only 12.9% of TM requests had an underlying cancer diagnosis [44]. Studies in the outpatient setting also showed comparable results. For example, in a study published in 2017, Gion et al. analyzed electronic health records of a Local Health Authority and found that 59.2% of the 52,536 outpatients for whom a TM was ordered were without a cancer code. A mean of 1.54 TM per person was ordered [45].
Second, most TM determinations in the outpatient setting were requested by GP (16,199, 44.3%), followed by the group "Others" (5843, 16.0%), gynecologists (5048, 13.8%), and tertiary hospitals (4582, 8.8%). This result is of particular interest since only very few studies have examined this issue, and those only in part. For example, a Brazilian retrospective analysis based on healthcare claims data from 2010 to 2017 examined the medical specialty of ordering physicians and found that most of the physicians were cardiologists (23.9%) [46]. However, the number of insured patients was rather low, and only 1112 TM tests were analyzed in the whole period. In our study, GP least frequently ordered TM adequately (27.8%). Most appropriate determinations were requested by medical oncologists (77.3%). Age had a statistically significant but very small effect. Notably, 5874 (16.1%) of the TM requests were made in patients older than 79 years. Despite increased life expectancy, an inadequate TM determination at this age seems even more questionable because it often has no therapeutic consequences. Guideline-based screening interventions such as colonoscopy or gynecological pap smear regularly have set age limits. Thus, screening or clarifying non-specific findings using TM, which is not evidence-based, indicates a high degree of inappropriateness in these elderly patients. Moreover, we detected 155 male patients with a CA125 determination, a marker for ovarian cancer, which corresponds to 1.7% of all CA125 requests. Although some cases might be explained by inadvertent and accidental requests, previous findings show that up to 33% of patients with CA125 determinations are male [47,48]. This large group of inappropriate requests could be caused by using lab block testing as a screening tool. The rare diagnoses that might justify a determination in male patients, e.g., paratesticular papillary carcinoma [49], probably cannot explain all of these requests. Laboratory forms with ready-made block orders are especially questionable in these cases. Schulenburg-Brand et al. [47], in addition to training measures, set up a laboratory information system that automatically rejected CA125 requests in male patients and found an absolute decrease from 127 to 27 requests.
It could be argued that there are clinical cases in which a TM determination is useful even without a coded CDI, such as paraneoplastic syndrome. However, these cases are very rare and cannot account for the large number of inadequate TM determinations [50].
Third, the total costs for index-date TM determinations between 2018 and 2021 were CHF 1,205,842 (EUR 1,096,220, USD 1,262,661). The proportion of costs for appropriately determined TM was 36.5%. From a health-economic perspective, the total healthcare costs in Switzerland were CHF 82,472 million (EUR 74,299 million, USD 83,305 million) in 2019 [51]. Although the cost of inappropriate TM determination might appear negligible, it should be emphasized that these are only the costs of the laboratory tests, and the costs of possible follow-up interventions are not quantified. Zhang et al. [52] found that inappropriate TM requests accounted for 1.3% to 2.1% of their hospitalization costs. Ntaios et al. [42] found that the total absolute cost of inappropriate TM testing over a 10-month period at their large hospital was EUR 239,748.
Moreover, the potential damage to patients and their unnecessary anxiety cannot be quantified. Moreno et al. [43] found that a remarkable 43% of the patients who had a positive result of the TM determinations had no cancer diagnosis, i.e., had a false positive result. For this reason, it is important to create awareness of the damage caused by medical over- and misutilization. Ntaios et al. [42] have therefore proposed a different term for the so-called "TM"; they suggested changing it to "tumor progression markers".

Strengths and Limitations
The present study has some limitations that need to be considered. First, despite the generally systematic coding of CDI, misclassification of CDI cannot be excluded completely. In Switzerland, physicians do not need to code the patient's diagnosis in the outpatient setting. Therefore, for patients whose cancer diagnosis was not made in a hospital, an ICD-10 diagnosis is not available. Second, our healthcare claims data contain no data on TM determination in hospitalized patients, since laboratory analyses in hospitalized patients are billed by case flat rates. Third, our classification of the TM requests as appropriate or inappropriate is in part rather broad, whereas guidelines define appropriateness more narrowly. For example, CA15-3 determination in a patient with localized breast cancer after cancer diagnosis or for surveillance would be classified as appropriate in our analysis, but according to guidelines only metastasized patients should have a determination of CA15-3 [53]. Further, oncology-specific ICD-O-3 diagnoses are not available in Helsana Group healthcare claims data. Therefore, no specification is possible for the morphological diagnosis of cancer, e.g., medullary thyroid cancer or neuroendocrine carcinoma. Additionally, the billing code for histology represents a very broad approximation parameter: The histological result could be benign or malignant, and it could have been caused by a dermatological excision or a visceral operation. We therefore classified a histology code under "CD conceivable". Fourth, our data do not include the outcome of the TM determination (e.g., positive, negative, false positive, and false negative). Thus, we cannot measure, for example, follow-up costs due to false positive markers and subsequent investigations. Data collected from hospitals and medical diagnostic centers might offer additional information about the clinical course. However, such data are neither comprehensive nor available nationwide in Switzerland. Further research would be valuable if both data sources (healthcare claims data and clinical data) could be linked anonymously to provide the most comprehensive information possible.
However, the present study also has several important strengths. First, it is based on a large study population covering about 1.4 million health insurance customers from all over Switzerland. Thus, the study most likely reflects the reality of daily medical routine very well and provides valuable real-world evidence. Second, we could systematically code CDI using several reimbursement and approximation codes such as antineoplastics, inpatient ICD-10 diagnoses, or operation codes. Third, we were able to analyze appropriate requests at the prescriber and patient level. Thus, possible influencing factors for inappropriate TM determinations could be examined and identified. Additionally, the study findings provide a solid base for the discussion of public health strategies to reduce further inappropriate TM determinations.

Conclusions
According to the present study, inappropriate determination of TM is a major problem in routine medical care, particularly in the primary care setting. Our results suggest that a considerable proportion of the population is at risk for various harms associated with misinterpretations of TM test results. Efforts to increase awareness among healthcare providers and patients about the potential harm of TM determinations are needed.

Table 1. Characteristics of the cohort of patients with tumor marker (TM) determination.

Table 2. Cancer-related diseases and interventions (CDI) in the previous 5 years and following 1 year after index tumor marker (TM) determination.
a Cancer Diagnosis.

Table 4. Ranking of medical specialties within the group of general practitioners (GP) and outpatient specialists.

Table 5. Regression analysis on appropriateness.
a OR = Odds ratio, significant results are in bold; b CI = Confidence interval.
Number of tumor marker (TM) determinations in one patient at index date.
Figure S1. Flow chart: Tumor marker (TM) determination, cancer-related diseases and interventions (CDI), and classification towards appropriateness according to the probability of cancer diagnosis (CD).