Prognostic and Predictive Biomarkers in Head and Neck Squamous Cell Carcinoma Treated with Radiotherapy—A Systematic Review

Background: Radiotherapy is a mainstay in head and neck squamous cell carcinoma (HNSCC) treatment but is mostly applied without stratification by molecular diagnostics. Development of reliable biomarkers may have the potential to improve radiotherapy (RT) efficacy and reduce toxicity. We conducted a systematic review to summarize the field of biomarkers in HNSCC treated by RT. Methods: Pubmed and EMBASE were searched independently by two researchers following pre-defined inclusion and exclusion criteria. Z curves were generated to investigate publication bias. OncoKB was used for identification of druggable targets. Results: 134 manuscripts remained for data extraction. 12% of tumors were AJCC/UICC stage I–II and 82% were stage III–IV. The most common biomarkers were proteins (39%), DNA (14%) and mRNA (9%). Limiting analysis to prospective data and statistically significant results, we found three potentially druggable targets: ERCC2, PTCH1 and EGFR. Regarding data quality, AJCC/UICC stage was missing in 32% of manuscripts. 73% of studies were retrospective and only 7% were based on prospective randomized trials. Z-curves indicated the presence of publication bias. Conclusion: An abundance of potential biomarkers in HNSCC is available but data quality is limited by retrospective collection, lack of validation and publication bias. Improved study design and reporting quality might accelerate successful development of personalized treatments in HNSCC.


Introduction
Squamous cell carcinoma of the head and neck (HNSCC) remains the 6th most common cause of cancer worldwide [1]. In recent years, an increasing number of publications emerged, aiming to identify prognostic and predictive biomarkers but HNSCC is still a tumor entity characterized by a paucity of personalized treatments. The only targeted therapy approved in the USA and European Union in the curative treatment setting is cetuximab, an EGFR-binding antibody whose mechanism of action has been suggested to be driven in large parts by triggering immunologic anti-tumor reactions and not predominantly by inhibition of the EGFR pathway [2]. Therefore, in the curative setting, HNSCC is still mostly tackled by surgery and radiotherapy (RT; with or without concomitant systemic therapy) without a molecular stratification to choose an escalated or de-escalated strategy, as is common in other oncological diseases (reviewed in Kerr et al. [3], Deacon et al. [4]). Development of reliable and affordable biomarkers is therefore an eminent concern to improve HNSCC treatments.
A recent systematic review reported on clinical and biological biomarkers in HNSCC and summarized a total of 86 studies, regardless of applied therapy and treatment intent [5]. However, treatment approach might be an important consideration because it is inherently linked with differing patient characteristics and outcomes. Patients receiving either surgery or radiotherapy as primary treatment differ significantly in clinical features, especially tumor stage and anatomical location [6]. Additionally, RT is directly affected by multiple biological processes such as hypoxia or tumor stem cells. Oxygen is among the strongest modifiers of radiosensitivity, expressed in a tumor response approximately 2.5 to 3 times greater (oxygen enhancement ratio) if sufficient oxygen is present at the time of radiation or very shortly thereafter. This has initially been explained as the result of oxygen-induced fixation of radiation-induced DNA damage by free radicals [7]. However, recent research supports additional mechanisms, e.g., modulation of angiogenesis, cell stress responses and immune response (reviewed in Sørensen et al. [8]). Recently, hypoxia-based stratification of HNSCC patients treated with RT has entered clinical trials in an attempt to leverage this effect for clinical benefit [9]. Cancer stem cells (CSC) are another biological factor influencing RT outcome. It was shown that the presence of the putative CSC marker CD44 is associated with an increased risk of local recurrence in patients with laryngeal cancer [10]. This effect has been confirmed in two publications by the German Cancer Consortium Radiation Oncology Group in the setting of primary radio-chemotherapy in locally advanced HNSCC, and post-operative RT after surgical resection of HNSCC, respectively [11,12]. Similar to hypoxia, stratification of patients by stem cell status might be a promising strategy in HNSCC after confirmation of this approach in prospective clinical trials. Lastly, ionizing radiation itself also acts as a mutation-inducing agent and might affect biomarkers, e.g., by modulation of the tumor immune microenvironment [13,14].
In summary, RT is one of the main treatment modalities for HNSCC and its effects are directly affected by tumor biology. Our analysis therefore focuses exclusively on patients who received RT as part of their treatment with predominantly curative intent and strives to give an overview over the field of biomarkers in HNSCC from a quantitative and qualitative perspective. We omitted publications involving HPV as a prognostic biomarker and manuscripts on circulating HPV-and EBV-DNA as these topics have already been discussed in several systematic reviews and meta-analyses [15][16][17][18].

Materials and Methods
We searched Pubmed for available literature using the following search terms: ). To find additional literature, we also searched EMBASE using analogous search terms. Abstracts were independently screened by two physicians with experience in the field of HNSCC, radiation oncology and preclinical science. Inclusion criteria were: (1) HNSCC, (2) ≥10 patients, (3) association of biomarker with relevant oncological endpoint provided, (4a) all patients treated with RT, or (4b) separate outcome reported for the subgroup of patients treated with RT, (5) publication date between 1 January 2010 and 1 March 2022. Exclusion criteria were: (1) Reports solely focusing on circulating HPV or EBV DNA, and studies exclusively focusing on HPV-status as a biomarker because a plethora of recent literature, including systematic reviews and meta-analyses are already available on these topics [17][18][19]). (2) studies solely based on publicly available databases (TCGA, NCDB), (3) narrative and systematic reviews. In ambiguous cases, consensus between both researchers was established to include or exclude the respective study.
Data was then extracted from the full-text version of remaining entries. Studies including patients not treated with RT were handled as follows: the study was excluded if no separate outcome parameters were reported for the RT group (as per exclusion criteria). Patient and tumor characteristics were gathered separately for the RT group, if provided in the manuscript. Otherwise, characteristics for the whole cohort, including patients not treated with RT, were extracted. Whenever available, multivariable tests were preferred over univariable ones. All data analysis was performed in R (v4.1.0) with packages: dplyr, ggplot2. Z curves were generated following the method described by Brunner, Schimmack and Bartoš [20,21], employing the R package zcurve. Literature search, data extraction, analysis and reporting were performed in accordance with the PRISMA and COSMOS-E statements [22,23]. The review was registered in the International Prospective Register of Systematic Reviews (PROSPERO) with the ID CRD42022365752. For identification of potentially druggable targets, data from OncoKB was used, including items with a therapeutic level ≥ 3 (date of data request: 13 June 2022) [24]. Ongoing clinical trials for biomarkers were identified via https://clinicaltrials.gov (date of access: 27 November 2022).

Literature Search and General Characteristics
We screened 5005 publications from two databases and excluded 477 based on the inclusion and exclusion criteria. The 234 remaining studies were further evaluated for data quality and inclusion criteria in the full-text version, after which 134 manuscripts remained for data extraction (Figure 1). Median publication year of included studies was 2016 and there was no clearly discernible increase or decrease in the number of published manuscripts per year over time ( Figure 2A). Studies had included a median of 100.5 patients but encompassed a large range spanning from 11 to 578 ( Figure 2B). topics [17][18][19]). (2) studies solely based on publicly available databases (TCGA, NCDB), (3) narrative and systematic reviews. In ambiguous cases, consensus between both researchers was established to include or exclude the respective study. Data was then extracted from the full-text version of remaining entries. Studies including patients not treated with RT were handled as follows: the study was excluded if no separate outcome parameters were reported for the RT group (as per exclusion criteria). Patient and tumor characteristics were gathered separately for the RT group, if provided in the manuscript. Otherwise, characteristics for the whole cohort, including patients not treated with RT, were extracted. Whenever available, multivariable tests were preferred over univariable ones. All data analysis was performed in R (v4.1.0) with packages: dplyr, ggplot2. Z curves were generated following the method described by Brunner, Schimmack and Bartoš [20,21], employing the R package zcurve. Literature search, data extraction, analysis and reporting were performed in accordance with the PRISMA and COSMOS-E statements [22,23]. The review was registered in the International Prospective Register of Systematic Reviews (PROSPERO) with the ID CRD42022365752. For identification of potentially druggable targets, data from OncoKB was used, including items with a therapeutic level ≥ 3 (date of data request: 13 June 2022) [24]. Ongoing clinical trials for biomarkers were identified via https://clinicaltrials.gov (date of access: 27 November 2022).

Literature Search and General Characteristics
We screened 5005 publications from two databases and excluded 477 based on the inclusion and exclusion criteria. The 234 remaining studies were further evaluated for data quality and inclusion criteria in the full-text version, after which 134 manuscripts remained for data extraction (Figure 1). Median publication year of included studies was 2016 and there was no clearly discernible increase or decrease in the number of published manuscripts per year over time ( Figure 2A). Studies had included a median of 100.5 patients but encompassed a large range spanning from 11 to 578 ( Figure 2B).

Data Quality and Missingness
A relevant part of extracted variables was incompletely documented or could not be determined unequivocally from the description in abstract or full text. In 8.2% of publications, outcomes were reported for a group of patients treated with RT but clinical data was only available for the cohort as a whole (i.e., including patients who did not receive RT). The version of the used TNM classification was not mentioned in 61% of studies and AJCC/UICC stage was missing in 32%. Sufficient information about RT dose was not provided in 28% of publications and there was no information on smoking and alcohol consumption in 54% and 80% of reports, respectively. HPV-status was fully or partially missing in 66% of publications, translating to 47% of included patients. Regarding statistical analysis, 34% of extracted correlations between biomarker and outcome were based only on univariable analyses. To address publication bias, we generated z-curves for the most commonly used endpoints overall survival (OS), progression-free survival (PFS) and locoregional control (LRC) ( Figure S1). We could demonstrate a marked discrepancy between the observed discovery rate (ODR) and the expected discovery rate (EDR) for all endpoints, indicating the presence of publication bias. Moreover, z-curves for OS and PFS demonstrated a difference in density immediately around z-score 1.96 (corresponding to p = 0.05), arguing for an overabundance of just-significant compared to just-non-significant results.

General Description of Included Studies
Among included studies, 7% were analyses of patient data from prospective randomized trials and another 10% from prospective, non-randomized interventional trials (Figure 3A). In 7% of publications, authors described the source of collected patient data as prospective but did not provide information about a potential study protocol or quality of patient follow-up and were hence classified as "self-proclaimed prospective". Anatomically, the most commonly reported primary tumor site was oropharynx (46%), followed by larynx (18%), oral cavity (17%) and hypopharynx (11%) ( Figure 3B). HPV-status was positive in 27% of all patients in studies with complete data, including two studies focusing only on HPV-positive cases. RT was described as the primary definitive treatment in 46%, and as adjuvant in 30% of publications; the remainder described mixed cohorts with

Data Quality and Missingness
A relevant part of extracted variables was incompletely documented or could not be determined unequivocally from the description in abstract or full text. In 8.2% of publications, outcomes were reported for a group of patients treated with RT but clinical data was only available for the cohort as a whole (i.e., including patients who did not receive RT). The version of the used TNM classification was not mentioned in 61% of studies and AJCC/UICC stage was missing in 32%. Sufficient information about RT dose was not provided in 28% of publications and there was no information on smoking and alcohol consumption in 54% and 80% of reports, respectively. HPV-status was fully or partially missing in 66% of publications, translating to 47% of included patients. Regarding statistical analysis, 34% of extracted correlations between biomarker and outcome were based only on univariable analyses. To address publication bias, we generated z-curves for the most commonly used endpoints overall survival (OS), progression-free survival (PFS) and locoregional control (LRC) ( Figure S1). We could demonstrate a marked discrepancy between the observed discovery rate (ODR) and the expected discovery rate (EDR) for all endpoints, indicating the presence of publication bias. Moreover, z-curves for OS and PFS demonstrated a difference in density immediately around z-score 1.96 (corresponding to p = 0.05), arguing for an overabundance of just-significant compared to just-non-significant results.

General Description of Included Studies
Among included studies, 7% were analyses of patient data from prospective randomized trials and another 10% from prospective, non-randomized interventional trials ( Figure 3A). In 7% of publications, authors described the source of collected patient data as prospective but did not provide information about a potential study protocol or quality of patient follow-up and were hence classified as "self-proclaimed prospective". Anatomically, the most commonly reported primary tumor site was oropharynx (46%), followed by larynx (18%), oral cavity (17%) and hypopharynx (11%) ( Figure 3B). HPV-status was positive in 27% of all patients in studies with complete data, including two studies focusing only on HPV-positive cases. RT was described as the primary definitive treatment in 46%, and as adjuvant in 30% of publications; the remainder described mixed cohorts with both treatment intents. Analysis of TNM and AJCC/UICC stage was limited by the frequent unavailability of the used classification system's version. Disregarding this restriction, only a minority of patients (12%) were early AJCC/UICC stage (I-II), whereas 81% were advanced stage (III-IV) and information was missing for 7%. Regarding RT, a multitude of fractionation schedules was employed, and reporting was frequently done in the format of a range. We therefore attempted to calculate the minimum applied equivalent dose 2 Gy (EQD2) that was described in each publication, resulting in a median of 66 Gy (range: 14 Gy-72 Gy). RT was mostly used in the primary or adjuvant setting but multiple publications also reported the use of both concepts in included patients ( Figure 3C) Information about concomitant systemic therapy was missing in 16% of studies; the remaining publications reported application of a diverse range of regimens, including platinum compounds, nimorazole, mitomycin C, 5-fluorouracil, cetuximab, combinations thereof and others. Overall, in studies where information was available, 74% of patients received concomitant systemic therapy. Concerning endpoints, OS (33%) was the most frequently used, followed by LRC (17%), PFS (16%) and disease-free survival (DFS, 9%) ( Figure 3D). A list of endpoint is provided in Abbreviations part.
Biomedicines 2022, 10, x FOR PEER REVIEW 5 of 15 both treatment intents. Analysis of TNM and AJCC/UICC stage was limited by the frequent unavailability of the used classification system's version. Disregarding this restriction, only a minority of patients (12%) were early AJCC/UICC stage (I-II), whereas 81% were advanced stage (III-IV) and information was missing for 7%. Regarding RT, a multitude of fractionation schedules was employed, and reporting was frequently done in the format of a range. We therefore attempted to calculate the minimum applied equivalent dose 2 Gy (EQD2) that was described in each publication, resulting in a median of 66 Gy (range: 14 Gy-72 Gy). RT was mostly used in the primary or adjuvant setting but multiple publications also reported the use of both concepts in included patients ( Figure  3C) Information about concomitant systemic therapy was missing in 16% of studies; the remaining publications reported application of a diverse range of regimens, including platinum compounds, nimorazole, mitomycin C, 5-fluorouracil, cetuximab, combinations thereof and others. Overall, in studies where information was available, 74% of patients received concomitant systemic therapy. Concerning endpoints, OS (33%) was the most frequently used, followed by LRC (17%), PFS (16%) and disease-free survival (DFS, 9%) ( Figure 3D). A list of endpoint is provided in Abbreviations part.

Biomarkers
We standardized biomarker names by replacing reported terms with HUGO symbols, where appropriate, and classified them in broad categories ( Figure 4A). By far the largest category were proteins (39%), frequently detected by immunohistochemistry staining. Other common categories were DNA (14%), often assessed for mutations and copy number alterations by sequencing; mRNA (9%), commonly measured by reversetranscription polymerase chain reaction (RT-PCR) or probes; and hematological markers (7%), subsuming measurements of red or white cells, or hemoglobin. Immune cells (7%)

Biomarkers
We standardized biomarker names by replacing reported terms with HUGO symbols, where appropriate, and classified them in broad categories ( Figure 4A). By far the largest category were proteins (39%), frequently detected by immunohistochemistry staining. Other common categories were DNA (14%), often assessed for mutations and copy number alterations by sequencing; mRNA (9%), commonly measured by reverse-transcription polymerase chain reaction (RT-PCR) or probes; and hematological markers (7%), subsuming measurements of red or white cells, or hemoglobin. Immune cells (7%) included tumor-infiltrating lymphocytes or other cells of immunological lineage infiltrating tumor tissue. Finally, we also assigned the categories serum/plasma proteins (e.g., albumin); and Biomedicines 2022, 10, 3288 6 of 14 signatures, consisting of a combination of biomarkers. Regarding the tissue that biomarkers were extracted from, two-thirds (67%) of samples were biopsies or surgical specimens from tumors ( Figure 4B). Only 23% were blood samples and a smaller share consisted of germline DNA (often from whole blood) and saliva. We further analyzed the number of biomarkers per included publication and found that the majority (51%) of papers dealt with one biomarker, with a marked fall-off at two (19%), three (12%) and four or more (18%) ( Figure 4C). Lastly, we calculated the number of independent publications for each biomarker in our dataset and could demonstrate that the majority (82%) had only been assessed in one publication. Only 10% were described in two, and 8% in three or more studies ( Figure 4D). or more (18%) ( Figure 4C). Lastly, we calculated the number of independent publications for each biomarker in our dataset and could demonstrate that the majority (82%) had only been assessed in one publication. Only 10% were described in two, and 8% in three or more studies ( Figure 4D).
When assessing individual biomarkers that were reported to have a statistically significant association to outcomes and provided hazard ratios, the strongest positive predictors of local and locoregional tumor relapse was FGF2 (LRC, HR = 7.33, p < 0.001), detected in a retrospective study. However, when considering only prospectively collected data and multivariable tests, FANCC (LRR, HR = 6.6, CI [  .07], p = 0.0001) showed the strongest associations. Next, we correlated biomarkers found in our dataset with the OncoKB Precision Oncology Knowledge Base [17] and found three potential targets within therapeutic levels 1 to 3 (i.e., excluding only preclinical evidence, not considering disease-specificity or type of alteration): ERCC2, PTCH1 and EGFR (Table 1). A complete summary of statistically significant, multivariable tests in prospective studies, associated OncoKB data, and ongoing clinical trials involving respective markers (clinicaltrials.gov, as of 27 November 2022) is provided in Table 1, and a full list of included studies in Supplementary Table S1.  When assessing individual biomarkers that were reported to have a statistically significant association to outcomes and provided hazard ratios, the strongest positive predictors of local and locoregional tumor relapse was FGF2 (LRC, HR = 7.33, p < 0.001), detected in a retrospective study. However, when considering only prospectively collected data and multivariable tests, FANCC (LRR, HR = 6.6, CI  [17] and found three potential targets within therapeutic levels 1 to 3 (i.e., excluding only preclinical evidence, not considering disease-specificity or type of alteration): ERCC2, PTCH1 and EGFR (Table 1). A complete summary of statistically significant, multivariable tests in prospective studies, associated OncoKB data, and ongoing clinical trials involving respective markers (clinicaltrials.gov, as of 27 November 2022) is provided in Table 1, and a full list of included studies in Supplementary Table S1.

Discussion
Biomarkers have gained an ever-increasing role in the treatment of a wide spectrum of oncological diseases in recent years [3,4]. Treatment is now regularly guided by detection of gene expression or mutations, leading to an increasingly personalized approach. So far, HNSCC has not experienced the progress that other tumor entities have made in this regard. The aim of this work was therefore not only to summarize the available literature but also to identify potential obstacles that hinder progress in this field.
In our analysis, we could find a broad palette of biomarkers, ranging from well-known genes to lesser-studied targets. Among markers from higher-quality studies and analyses, we identified three targets that are potentially druggable according to OncoKB: (1) EGFR is a member of the ErbB family of receptors and binds to several of the epidermal growth factor (EGF) family of ligands. The EGFR pathway is involved in a multitude of cellular processes in cancer cells, most prominently the promotion of cell growth, division, migration and cell survival (reviewed in Normanno et al. [25]). EGFR has been studied extensively in the context of HNSCC and is an established clinical target with evidence of efficacy from randomized-controlled clinical trials [26]. (2) ERCC2, a DNA helicase that is involved in nucleotide excision repair [27]. Genetic alterations of this gene have been linked to increased chemotherapy-sensitivity, especially to cisplatin [28][29][30]. Interestingly, ERCC2 has also been implicated in a recent meta-analysis in higher tumor stage and grade, and a positive correlation with Ki-67 in HNSCC, suggesting a more aggressive tumor phenotype [31]. However, the hazard ratio (HR) of ERCC2 ranged between 0.42 and 2.07 in the three respective publications (Table S1) in our dataset, making a definitive interpretation difficult.
(3) PTCH1, a member of the hedgehog signaling pathway with important roles in embryonic development and tumorigenesis (reviewed in Villavicencio et al. [32]). PTCH1 inactivation has been identified in early dysplastic lesions of the head-and-neck region [33] and is thought to be one of the main causes of nevoid basal cell carcinoma (BCC) syndrome, an autosomal dominant disease characterized by frequent BCC [34]. The relevant publication from our dataset assessed deletion and downregulation by methylation of PTCH1 and associated this with an increased risk for locoregional recurrence. The hedgehog pathway is druggable by sonidegib and vismodegib, both clinically approved for the treatment of BCC [35]. Another potentially druggable biomarker from a prospective cohort but without a reported HR in our analysis was the mutated variant of KRAS, G12C. A clinically approved treatment targeting this mutation in non-small cell lung cancer (NSCLC) [36] is available but unfortunately, the respective publication in our analysis did not indicate the specific KRAS mutation linked to the reported worse LRC. Prevalence of the G12C mutation has been estimated to be 0.24% in HNSCC [37]. A reliable statement about the mechanism of action (e.g., radio-or chemosensitization) of the four reported markers is not possible within the scope of this work, as none of the studies was designed to prove a causative, mechanistic relation between marker and an individual treatment.
Generally, prognostic markers are defined as characteristics that can be used to determine the probability of a pre-defined clinical event. On the other hand, predictive biomarkers allow to estimate an effect of exposure to an external factor, e.g., tumor response to a specific therapeutic intervention [38]. Considerable overlap exists between both categories and a distinction is only possible if there are two groups of patients, one with and one without the trait of interest. Additionally, a biomarker might be both prognostic and predictive. The most notable example for adoption of prognostic markers in HNSCC is usage of HPV-status for clinical and pathological staging [39,40]. There is, however, currently no established clinical consensus how the presence of HPV in HNSCC should affect therapeutic decisions as a predictive marker, but multiple clinical studies testing a de-escalation of systemic therapy or radiotherapy treatment have been published, recently, and further progress in this field is to be expected in the coming years [41][42][43]. Currently, no predictive markers are in routine clinical use for HNSCC with the exception of PD-L1 status and the combined positive score (CPS) before administration of pembrolizumab for palliative treatment [6]. Several studies are ongoing to establish predictive markers in the setting of radiotherapy in HNSCC, prominently the phase-III randomized DAHANCA-30 trial that tests administration of the radiosensitizer nimorazole, based on a hypoxia gene profile also included in our dataset (Table 1) [9,44]. Other significant efforts include development of the genomic-adjusted radiation dose (GARD) that was originally published in 2017 [45] and validated in HNSCC in a study in our dataset (HR = 4.9 for LRC, CI [1.97-12.3]) [46]. A recent pan-cancer validation study again found an overall effect on time to first recurrence but the subgroup of HNSCC demonstrated a small effect size and did not reach statistical significance (HR = 0.99, CI [0.96-1.01]) [47]. Among publications included in our study, a clear distinction between prognostic and predictive marker was often not possible because treatment was not uniformly applied to all patients, clinical data was missing, and most studies were not designed to demonstrate this difference.
When comparing our results with the recent systematic review of Budach et al., we identified 25 common biomarkers out of 246 in our dataset [5]. A certain overlap is expected between the two works because both are systematic reviews of biomarkers in HNSCC. However, our analysis focuses exclusively on patients receiving RT, covers a different time period and has set different inclusion criteria (e.g., minimum number of reported patients), thereby explaining the marked difference in results.
Our analysis has several strengths, including the systematic approach and a reasonably sized dataset of 134 included studies. We extracted not only outcomes but also quantifiable results such as hazard ratios and p-values, allowing some relevant analyses, e.g., of publication bias. One weakness of our study originates in the quality of data that is available in the literature. Our results show that more than 70% of included publications had been performed in a retrospective setting. While it is to be expected that hypothesisgenerating studies are often done post-hoc, the inevitably introduced selection bias and lack of statistical power make follow-up studies necessary to confirm initial results. However, another finding was the lack of confirmatory and validation studies. While some included manuscripts had internal validation cohorts or were follow-up studies of earlier results, the majority of them reported isolated findings, leading to a large number of unvalidated hypotheses. This might in part be caused by publications not encompassed by our search terms but adding to this data, we found indications for marked publication bias in included studies in all three analyzed endpoints. In recent years, some authors have described a "reproducibility crisis" in various branches of scientific work with a large swath of results not being able to be replicated by independent research groups [48][49][50]. With the presented data in mind, it seems conceivable that the field of biomarkers in HNSCC might be at risk to be affected by this phenomenon.
In addition, another problem of our analysis was the unstructured reporting of included publications. Data was frequently categorized in arbitrary groups (e.g., different lumping of T or N stages) or only partially reported. In other cases, vital information was simply missing, such as the AJCC/UICC staging version, which has changed multiple times in recent decades with significant modifications over time, making a comprehensive comparison of included patient cohorts nearly impossible. Furthermore, the most frequently reported outcome was OS, which, as a composite endpoint, might not be appropriate for biomarkers if tumor response to a local or regional therapeutic regimen is to be measured. HNSCC patients regularly suffer from numerous comorbidities and severe treatment toxicity [51], acting as a competing risk for OS. Consequently, it has been reported in HNSCC that LRC does not consistently translate into OS [52].
Despite likely being the most widely-studied molecular marker in HNSCC, we omitted publications focusing on HPV as a prognostic marker for this work. HPV status has been of increasing importance in the clinical setting since the seminal analysis of the RTOG 0129 cohort [53] and has been introduced into the 8th UICC/AJCC staging system. Due to its preeminent stance among prognostic HNSCC markers, an abundance of publications on this topic is available, which has already spurred multiple systematic reviews and metaanalyses [15,16]. Including this large number of publications would have gone beyond the scope of this work. On similar grounds, we also excluded reports solely studying circulating HPV-or EBV-DNA, as these have been the subject of recent systematic reviews [17][18][19]. Of note, publications included in this review did contain HPV-positive as well as HPV-negative cases, but results were generally not stratified by HPV-status. The differing biology of tumors depending on HPV infection or, alternatively, exposure to other carcinogens might lead to divergent profiles of prognostic and predictive biomarkers, but data quality was not sufficient to discern this potential effect in our work.

Conclusions
In conclusion, the field of prognostic and predictive biomarkers in HNSCC shows a dynamic evolution during recent years and has the potential to significantly impact radiotherapy outcomes by reducing toxicity (e.g., via dose and volume de-escalation) and increasing tumor control (e.g., by exploiting radiosensitizing molecular alterations or increasing radiotherapy dose). However, considering the large number of published exploratory investigations, there is a lack of validation studies and prospective clinical trials to generate reliable, high-quality data to translate these findings into clinical practice. To achieve this, future studies should not only focus on the discovery of a marker or effect but validation in an independent patient cohort, ideally with pre-planned and prospectively collected data from multiple institutions. This will require intensive collaboration between pre-clinical scientists and physicians, beginning from the process of defining a marker of interest up to testing promising candidates in clinical trials. Relevant endpoints should be chosen for these studies, depending on the investigated disease, test or treatment. Additionally, to allow comparison of independently published manuscripts, stringent reporting standards should be maintained, as outlined, for example in the REMARK [54] guidelines. Lastly, the publication of results that are negative, but obtained from highquality studies, would avoid unnecessary duplication of work and thereby enable a more efficient use of resources.

Conflicts of Interest:
The authors declare no conflict of interest.