Health-Related Quality of Life in Oral Cancer Patients: Scoping Review and Critical Appraisal of Investigated Determinants

Simple Summary
Oral cancer may strongly impair patients’ quality of life. Huge efforts have been made during recent decades in trying to improve the treatment outcomes in terms of patients’ survival, self-perception, and satisfaction. Consequently, the investigation into health-related quality of life (HRQOL) became an established and worldwide practice. Hundreds of studies tried to clarify which could be the most important variables that impact HRQOL in head and neck cancer patients. However, such a complex topic may be influenced by a multitude of interconnected aspects and several controversies were reported. In this study the current literature was reviewed to identify all those possible sources of bias that may be encountered in trying to correlate HRQOL to patient-specific or disease/treatment-specific aspects. As a result, a list of recommendations was reported to enhance the evidence of future studies.

Abstract
Background: health-related quality of life (HRQOL) represents a secondary endpoint of medical interventions in oncological patients. Our aim was to highlight potential sources of bias that could be encountered when evaluating HRQOL in oral cancer patients. Methods: this review followed PRISMA-ScR recommendations. Participants: patients treated for oral cancer. Concept: HRQOL assessed by EORTC QLQ-C30 and QLQ-H&N35/QLQ-H&N43. A critical appraisal of included studies was performed to evaluate the accuracy of data stratification with respect to HRQOL determinants. Results: overall, 30 studies met the inclusion criteria, totaling 1833 patients. In total, 8 sociodemographic (SDG) and 15 disease/treatment-specific (DT) HRQOL determinants (independent variables) were identified. The mean number of the independent variables was 6.1 (SD, 4.3): 5.0 (SD, 4.0) DT-related and 1.1 (SD, 1.8) SDG-related variables per article. None of the included papers considered all the identified determinants simultaneously. Conclusions: a substantial lack of evidence regarding HRQOL determinants was demonstrated. This strongly weakens the reliability of the reported findings due to the challenging presence of baseline confounding, selection, and omitted variable biases. The proposed approach recommends the use of further evaluation tools that gather more variables in a single score together with a selection of more homogeneous, reproducible, and comparable cohorts based on the identified baseline confounding.


Introduction
Patient-reported outcomes (PROs) provide valuable information about difficulties in everyday life and the perception of psychological and physical wellness from the patient's perspective. Over recent decades, PROs have gained more relevance in treatment decision making, so much so that the U.S. Food and Drug Administration (FDA) and the European Medicines Agency (EMA) consider them, including quality of life, a relevant endpoint for approving new therapies [1][2][3]. To approach a topic as complex as quality of life in oncological patients, researchers commonly refer to health-related quality of life (HRQOL). A distinction between these concepts has been made to exclude, at least theoretically, influences from domains that are not related to the patient's health status [4].
The concept of "quality of life" was first introduced by Heckscher [5], and in 1977 was adopted as a "keyword" by the United States National Library of Medicine [6]. Since then, several definitions have been proposed [7,8]. The WHO defined quality of life as "individuals' perceptions of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards, and concerns" [9].
Head and neck tumors and their treatment may negatively affect patients' HRQOL, which is now considered an essential secondary outcome of treatment [10,11]. For this reason, reliable evaluation tests are essential to better understand how and why specific medical interventions should be chosen and adapted according to individual needs. The search for the ideal quality of life evaluation test led researchers to identify some key requirements: a test should be reproducible, sensitive, and easy to understand [12]. Questionnaires developed by the European Organization for Research and Treatment of Cancer (EORTC) Quality of Life Group are widely used in current literature to address these needs. A core questionnaire (EORTC QLQ-C30) is associated with site-specific validated modules (EORTC QLQ-H&N35/43), consisting of single- and multi-item scales that measure several head and neck symptoms [13,14].
HRQOL is a complex topic and needs to be analyzed taking into account every potential influencing factor. Various sociodemographic, disease-specific, and treatment-specific aspects have been recognized as affecting HRQOL [12,15-20]. Several researchers have investigated its intrinsic multidimensionality, concluding that HRQOL plays a role in treatment decision making, but none have verified what the relevant items are and how this feature is assessed. The primary aim of the present review was to highlight possible sources of bias that could be encountered when evaluating HRQOL in patients treated for oral cancer. The second aim was to lay the foundation of a standardized protocol for cohort selection, data collection, and stratification that could enhance knowledge in the field.

Materials and Methods
This study was conducted following recommendations by PRISMA for scoping reviews (see supplementary document Table S1). Description of primary objectives was carried out according to the JBI reviewer's manual [21]: participants = patients treated for oral cancer; concept = HRQOL assessed by EORTC questionnaires; context = not specified.
A systematic search of published literature was performed in PubMed, EMBASE, and Scopus databases without limitations concerning the date of publication (last screening on 2 February 2021), based on the following search query: (oral cancer OR oral cancers OR tongue cancer OR tongue cancers OR mandible cancer OR cancer of floor of the mouth OR cancers of floor of the mouth OR fom cancer OR fom cancers OR palate cancer OR palate cancers OR palatal cancer OR palatal cancers OR cheek cancer OR cheek cancers OR buccal cancer OR buccal cancers OR gingival cancer OR gingival cancers) AND (quality of life OR health-related quality of life OR health related quality of life OR hrqol OR qol) AND eortc.
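As an illustration, the Boolean structure of the search query above can be assembled from three term groups. The following is a minimal sketch: the term lists are copied from the reported query, while the helper names (`or_block`, `build_query`) are hypothetical and no database-specific field tags or API calls are implied.

```python
# Term groups copied from the search query reported in the text.
SITE_TERMS = [
    "oral cancer", "oral cancers", "tongue cancer", "tongue cancers",
    "mandible cancer", "cancer of floor of the mouth",
    "cancers of floor of the mouth", "fom cancer", "fom cancers",
    "palate cancer", "palate cancers", "palatal cancer", "palatal cancers",
    "cheek cancer", "cheek cancers", "buccal cancer", "buccal cancers",
    "gingival cancer", "gingival cancers",
]
QOL_TERMS = [
    "quality of life", "health-related quality of life",
    "health related quality of life", "hrqol", "qol",
]
INSTRUMENT_TERM = "eortc"


def or_block(terms):
    """Join synonyms with OR and wrap them in parentheses (hypothetical helper)."""
    return "(" + " OR ".join(terms) + ")"


def build_query():
    """Combine the three concept groups with AND (hypothetical helper)."""
    return f"{or_block(SITE_TERMS)} AND {or_block(QOL_TERMS)} AND {INSTRUMENT_TERM}"


QUERY = build_query()
print(QUERY)
```

Keeping the synonyms in lists makes it easier to audit the query or to adapt it to the syntax of each database.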
All results were exported to Endnote™ bibliographic management software (Clarivate™, Philadelphia, PA, USA). After duplicate removal, the study design filter was applied according to the inclusion/exclusion criteria reported in Table 1. To minimize potential language selection biases, all non-English language papers were moved to the title and abstract screening phase if at least the abstract was reported in the English language.
Two authors (D.D.C. and C.S.) independently screened retrieved articles by titles and abstracts. Any disagreements were resolved by the intervention of a third author (G.C.). Those papers considered relevant for the topic were selected for full-text reading and independently screened by two authors (D.D.C. and C.S.) following inclusion/exclusion criteria reported in Table 1. Disagreements were resolved by a third author (G.C.). The PRISMA search flow diagram reported in Figure 1 summarizes our strategy.

Data Extraction
According to the findings reported in screened studies and previously published reviews [12,16,20], those sociodemographic (SDG) and disease/treatment-specific (DT) variables that have been found to be linked to patients' HRQOL were identified and listed.
The following information was retrieved from included studies: country; study design; characteristics of studied populations, such as sample size; SDG features (gender, age, marital status/family, comorbidity, smoking, alcohol consumption, educational level, employment status); DT features (tumor site, tumor T stage, mandibular resection, extent of resection, surgical approach, neck dissection (ND), reconstruction, neoadjuvant radiotherapy (nRT) and adjuvant radiotherapy (RT), neoadjuvant chemotherapy (nCT) and adjuvant chemotherapy (CT), neoadjuvant chemoradiotherapy (nCRT) and adjuvant chemoradiotherapy (CRT), presence of synchronous lesions at baseline, recurrence or metachronous lesions developed before HRQOL evaluation, major post-surgical complications, and secondary surgery required). Additional information was retrieved during the appraisal of the included studies (such as the use of further scoring systems).

Critical Appraisal
Included studies were evaluated and marked as follows:
• "Stratified" for each independent variable related to EORTC QLQ-C30 and/or EORTC QLQ-H&N35/43 *.
• "Homogeneous" for each independent variable when all the included cases were equal concerning that specific feature.
• "Excluded" or "not present in the sample" for each independent variable if the cases reporting that specific feature were excluded during cohort selection, or if that specific feature was not observed in the screened population.
• "Incomplete stratification" for each independent variable related to EORTC QLQ-C30 and/or EORTC QLQ-H&N35/43, in case of uneven or incomplete sample grouping rules.
• "Not stratified" for each independent variable that was reported but not related to EORTC QLQ-C30 and/or EORTC QLQ-H&N35/43.
• "Not available" for each independent variable that was not clearly described, or not described at all, in the sample features.
A color-coding system was applied to these categories.
* Specifically, for tumor site, "stratified" and "not stratified" were replaced by "stratified by oral subsites" and "not stratified by oral subsites", respectively, given that differences were found among tumors located in different oral subsites about their influence on patients' HRQOL.
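The marking scheme above can be sketched as a small data structure. This is only an illustration: the names `Mark`, `ADEQUATE`, `CONSIDERED`, and `summarize` are hypothetical, the example study is invented, and the grouping of marks into "considered" versus "adequately handled" is an assumption about how the categories are later aggregated.

```python
from collections import Counter
from enum import Enum


class Mark(Enum):
    """Appraisal categories assigned to each independent variable of a study."""
    STRATIFIED = "stratified"
    HOMOGENEOUS = "homogeneous"
    EXCLUDED_OR_NOT_PRESENT = "excluded / not present in the sample"
    INCOMPLETE = "incomplete stratification"
    NOT_STRATIFIED = "not stratified"
    NOT_AVAILABLE = "not available"


# Assumption: a variable is adequately handled when it is properly stratified,
# homogeneous across the cohort, or excluded / absent from the sample.
ADEQUATE = {Mark.STRATIFIED, Mark.HOMOGENEOUS, Mark.EXCLUDED_OR_NOT_PRESENT}
# Assumption: incomplete stratification still counts as "considered".
CONSIDERED = ADEQUATE | {Mark.INCOMPLETE}


def summarize(study_marks):
    """Count how many variables a study considered and handled adequately."""
    marks = list(study_marks.values())
    return {
        "considered": sum(m in CONSIDERED for m in marks),
        "adequate": sum(m in ADEQUATE for m in marks),
        "by_mark": Counter(marks),
    }


# Invented study used only to show the bookkeeping.
example_study = {
    "tumor site": Mark.HOMOGENEOUS,
    "tumor T stage": Mark.STRATIFIED,
    "reconstruction": Mark.INCOMPLETE,
    "neck dissection": Mark.NOT_STRATIFIED,
    "age": Mark.NOT_AVAILABLE,
}
summary = summarize(example_study)
print(summary["considered"], summary["adequate"])  # 3 2
```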

Results
The initial search yielded a total of 1655 studies. First, 403 duplicate records were removed. Then, in accordance with the applied study design criteria (Table 1), 547 records were excluded (488 conference abstracts, 1 conference review, 47 reviews, 6 books, 2 book chapters, 1 editorial, 2 short surveys). The remaining 705 records were screened by title and abstract (including 37 non-English language papers), resulting in 223 articles that were considered relevant for the topic and selected for full-text reading. The online search finally yielded 25 articles that met inclusion/exclusion criteria. The screening of grey literature and citations of included studies revealed 5 more relevant papers. Thus, a total of 30 studies were included for the critical appraisal. The search strategy is summarized in the PRISMA flow diagram (Figure 1). Although outside the scope of the adopted study design, reasons for exclusion after full-text reading are summarized in Figure 2 and extensively reported in the supplementary document Table S2. The most common reason for exclusion was related to the heterogeneity of the studied cohorts (or poor data stratification) regarding the tumor location. According to previously published reviews [12,16,20] and included articles, we identified and listed 23 potential determinants of HRQOL (see supplementary documents Tables S3 and S4). Almost all of them were considered as an independent variable for statistical analysis by at least one of the included studies, except for employment status, which was elsewhere advocated to influence HRQOL [22,23].
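The record counts reported in this paragraph can be cross-checked with simple arithmetic. The sketch below uses only the numbers stated in the text; the variable names are illustrative, and the two derived exclusion counts at the end are computed here rather than reported by the authors.

```python
# Counts as reported in the Results paragraph.
identified = 1655
duplicates = 403
design_exclusions = {
    "conference abstracts": 488,
    "conference review": 1,
    "reviews": 47,
    "books": 6,
    "book chapters": 2,
    "editorial": 1,
    "short surveys": 2,
}
full_text_read = 223
included_from_search = 25
included_from_other_sources = 5

# Records removed by the study design filter.
excluded_by_design = sum(design_exclusions.values())

# Records left for title and abstract screening.
screened = identified - duplicates - excluded_by_design

# Final number of studies in the critical appraisal.
total_included = included_from_search + included_from_other_sources

# Derived values (not stated in the text, computed from the figures above).
excluded_at_title_abstract = screened - full_text_read
excluded_at_full_text = full_text_read - included_from_search

print(excluded_by_design, screened, total_included)  # 547 705 30
```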

Study Design
A summary of study design, overall data stratification, and findings of included studies is reported in Table 2. In total, 18 were cohort studies (15 prospective and 3 retrospective), 11 followed a cross-sectional design, and 1 was a case-control study. Of the included studies, 27 were conducted on a single-center population and three were multicenter studies (one prospective cohort, one retrospective cohort, and one cross-sectional study). The whole sample of this review comprised 1833 OC cases.
Table 2. Study design and independent variables considered for data stratification and findings.
Legend to Table 2: ACE-27 = Adult Comorbidity Evaluation 27 score; ADM = acellular dermal matrix; BAMM = buccinator myomucosal flap; BOT = base of tongue; CRT = chemoradiotherapy; CT = chemotherapy; DCIA = deep circumflex iliac artery flap; FFF = free fibula flap; FOM = floor of the mouth; G8 = Geriatric 8 screening tool; HADS = Hospital Anxiety and Depression Scale; HNC = head and neck cancer; KFI = Kaplan-Feinstein index; MRND = modified radical neck dissection; ND = neck dissection; NOS = not otherwise specified; OC = oral cavity; OCC = oral cavity cancer; OP = oropharynx; OOP = oral cavity and oropharynx; OOPC = oral/oropharyngeal cancer; ORFFF = osteofasciocutaneous radial forearm free flap; OSCC = oral squamous cell carcinoma; PMMC = pectoralis major myocutaneous flap; RFFF = radial forearm free flap; RT = radiotherapy; SCAIF = supraclavicular artery island flap; SCC = squamous cell carcinoma; SND = selective neck dissection; STSG = split thickness skin graft.

Sociodemographic Variables (SDG)
A summary of data stratification by SDG variables is reported in Table 3 (for further features see supplementary document Table S3). In total, 8 of the 23 selected variables were related to SDG aspects. None of the included articles considered all SDG variables simultaneously during cohort selection or for data analysis.
Age was reported by 28 articles; data stratification was properly performed by six [9,26,27,46,49] and inadequately by two (which did not report age thresholds) [43,48]. One study investigated a homogeneous population for this variable [36].
Marital status/family was reported by six articles and data stratification was properly performed by four [25,27,33,43].
Comorbidity status was reported by seven articles, four of which properly performed data stratification [25,26,33,46]. One study excluded patients affected by severe comorbidity status [27].
Smoking was reported by seven articles and data stratification was performed by one [27].
Alcohol consumption was reported by four articles and data stratification was performed by two [27].
Educational level was reported by four articles and data stratification was performed by two [27,33].
Employment status/annual income was reported by three articles and data stratification was performed by one [33].

Disease/Treatment-Specific Variables (DT)
A summary of data stratification by DT variables is reported in Table 3 (for further information see supplementary document Table S4). In total, 15 of the 23 selected variables were disease- and treatment-related aspects, seven of which were linked to surgical procedures (see methods paragraph).
None of the included articles considered all DT variables simultaneously during cohort selection or data analysis.
Data from the included studies were adequately stratified by involved oral subsites in three papers [37,46,48] and incompletely/inadequately in five (which grouped different oral subsites together) [9,23,30,36,49]. Five studies investigated homogeneous populations regarding this variable: mobile tongue cancers in three [28,31,45], lower lip cancers in one [38], and buccal mucosa cancers in another [39].
Of the included studies, two were conducted on patients who had undergone medical treatments without surgery [45,49], and were thus marked as "not present" (NP) with respect to all the surgery-related DT variables. The only exception was the study of Petruson et al. [45], which was marked as "not available" for "required secondary surgery" since the authors did not clearly define whether a part of the studied sample underwent a secondary surgery after definitive medical treatment.
Mandibular resection was explicitly reported in 15 articles: data stratification was properly performed in three [23,36,46]; incomplete/inadequate stratification was performed in one (which compared patients without mandibular resection to those undergoing mandibular resection, grouping marginal and segmental resections together) [24]; six studies clearly stated that none of the included cases underwent mandibular resection [28,31,38,39,45,49]; and in three studies, the investigated population homogeneously underwent segmental mandibular resection [29,40,48].
The extent of surgical resection was considered "stratified" only in those cases where the resected oral subsites were clearly identified. This variable was indicated in seven articles; according to this definition, none performed stratification. Data from one study were considered incompletely/inadequately stratified due to the reported horizontal defect size (which partially defined the extent of surgical resection) [48]. In two studies, the investigated population homogeneously underwent the same resection: partial glossectomy in one [28] and partial pelviglossectomy in the other [31].
The surgical approach was indicated in seven articles; data stratification was properly performed in one [31]. In three studies, the investigated population homogeneously underwent transoral surgery [28,38,39].
ND was indicated in 10 articles: data stratification was properly performed in one (that is, different standardized procedures [52] were separately investigated) [37] and incomplete/inadequate stratification was performed in six (mostly because the type of ND was not specified) [10,23,28,36,39,48].
Reconstruction was reported in 23 articles: data stratification was properly performed in five (that is, each investigated reconstruction strategy, i.e., each type of free flap, each type of regional flap, each type of local flap, primary closure, and each type of graft, was investigated separately) [28,29,32,38,40]; incomplete/inadequate stratification was performed in seven [23,24,31,35,37,46,48]; and the investigated populations homogeneously underwent the same reconstruction strategy in four studies (radial forearm free flap) [22,25,26,39].
nRT was reported in 12 articles: data stratification was properly performed by one [35] and incomplete/inadequate stratification was performed by two (that is, the authors did not define whether the radiotherapy was performed before or after surgery) [32,33]. In one study, the investigated population homogeneously underwent nCRT [36]; five studies stated that none of the included cases underwent nRT [31,34,39,45,49].
nCT was reported in eight articles: data stratification was properly performed by one [35]; incomplete/inadequate stratification was performed by another (that is, the authors did not define whether the radiotherapy was performed before or after surgery) [33]; in one study, the investigated population homogeneously underwent nCRT [36]; and five articles stated that none of the included cases underwent nCT [31,34,39,45,49].
CT (either adjuvant or definitive) was reported in 13 articles: data stratification was properly performed by four [30,34,37,39]; incomplete/inadequate stratification was performed by one (that is, the authors did not define whether radiotherapy was performed before or after surgery) [33]; in three studies, the investigated population homogeneously underwent RT or CRT (either adjuvant or definitive) [28,45,49]; and one article explicitly stated that none of the included cases underwent CT [35].
The presence or absence of patients with synchronous lesions at baseline in the studied sample was explicitly indicated in four articles: in one paper, patients who presented synchronous lesions were excluded a priori [28], while the authors of two studies stated that these patients were not present in the studied population [31,49].
Table 3. Summary of data stratification by DT and SDG variables.
The presence or absence of patients who developed metachronous neoplasms or disease relapse in the studied sample was indicated in 17 articles: data stratification was properly performed in two [39,46]; in eight papers, patients who developed a relapse or a metachronous lesion were excluded from data analysis [24,26,34,36,42,48,50,53]; and the authors of another study explicitly stated that these patients were not present in the studied population [31].
The presence or absence of patients who experienced major post-surgical complications in the studied sample was reported in nine articles: data stratification was properly performed in one [32]; incomplete/inadequate stratification was performed in two (that is, an uneven definition of this variable was reported, e.g., partial and total flap loss not distinguished, or major surgical complications NOS) [26,46]; in another paper, these patients were excluded from data analysis [48]; and the authors of three studies clearly stated that no major post-surgical complications were observed in the investigated sample [28,31,38].
Patients who required secondary surgery for tumor relapse or reported major post-surgical complications were included in six articles: none performed data stratification regarding this variable; in one study, these patients were excluded from the data analysis [24]; and the authors of another study stated that no secondary surgery was performed in the investigated sample [31].

Descriptive Analysis
RT and gender were the most frequently considered among DT and SDG variables, respectively, followed by mandibular resection and reconstruction in the former group, and by age and comorbidity in the latter (Figures 3 and 4). Results also showed that these studies focused on the exclusion of patients who developed recurrences or metachronous lesions. On average, only 5.0 (SD, 4.0) DT variables were considered by each included study, and 5.1 (SD, 3.8) for each case when the average was weighted by the number of cases. However, these values dropped to 3.7 (SD, 3.8) and 3.8 (SD, 3.7) if only proper analysis, exclusions, and homogeneity were considered (Table 3, Figure 5). On average, only 1.1 (SD, 1.8) SDG variables were considered by each included study, and 1.0 (SD, 1.9) for each case when the average was weighted by the number of cases. Similar values were obtained considering only proper analysis, exclusions, and homogeneity (Table 3, Figure 5).
As mentioned above, surgery-related DT variables were considered as "not present" (NP) for those studies that investigated a non-surgical population [45,49]. Thus, these two studies ranked among the most accurate analyses of the included studies (Figure 6).
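To make the distinction between the per-study average and the weighted per-case average concrete, the following sketch computes both. The numbers are hypothetical and are not data from the review; the weighted SD formula (population-style, weights equal to sample sizes) is an assumption, since the text does not state which formula was used.

```python
import math
import statistics

# Hypothetical values for illustration only (NOT data from the review).
variables_per_study = [2, 5, 8]   # DT variables considered by each study
cases_per_study = [100, 50, 50]   # sample size of each study


def weighted_mean(values, weights):
    """Mean of values, each weighted by the number of cases in its study."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_sd(values, weights):
    """Population-style weighted SD (assumed formula, see lead-in)."""
    m = weighted_mean(values, weights)
    variance = sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)
    return math.sqrt(variance)


# Per-study statistics: every study counts once.
per_study_mean = statistics.mean(variables_per_study)
per_study_sd = statistics.stdev(variables_per_study)

# Per-case statistics: larger studies contribute proportionally more.
per_case_mean = weighted_mean(variables_per_study, cases_per_study)
per_case_sd = weighted_sd(variables_per_study, cases_per_study)

print(per_study_mean, per_case_mean)  # 5 4.25
```

In this invented example the largest study considered the fewest variables, so the per-case average falls below the per-study average.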

Discussion
Although this article was initially designed as a systematic review and meta-analysis, in our opinion, the outcomes would have been meaningless due to the inhomogeneity of included studies and the biases that might have occurred. As a result, we chose to investigate how closely potential influencing factors were evaluated, to highlight possible sources of bias that could be encountered when assessing HRQOL in oral cancer patients.

Gender and Age
Among sociodemographic variables, gender and age were the most investigated ones. Most of the included studies found no differences concerning these variables [9,25,27,46,49,53]. Remarkably, Kovacs et al. [37] reported worse results in males regarding financial difficulties and cognitive and social functioning. This offers food for thought, considering that household income derives most commonly from men.
Non-standardized thresholds were used when investigating the potential influence of age. Moreover, it is noteworthy that during the last decades chronological age has progressively lost its relevance according to the comprehensive geriatric assessment (CGA) approach. An innovative concept of "psychological age" is gaining momentum in the field [54,55] and it was adapted to HNC patients by Pottel et al. [56], who assessed the effectiveness of different health status screening tools. They found that Geriatric 8 (G8) represents the index of choice to identify patients in a CGA approach. Among included studies, Bozec et al. [27] performed a stratification using the G8 tool, finding a significant negative correlation between HRQOL and scores lower than 15.

Marital Status and Family
Marital status and family were investigated by four studies [25,27,33,51], all reporting no associations with questionnaire scores. However, Bozec et al. [27] found a negative correlation by stratifying the results of the EORTC QLQ-ELD14. This finding suggests the existence of masking effects from other variables that might strongly impact QLQ-C30 and H&N35, hiding possible influences of marital status and family conditions.

Comorbidity
Only a minority of the included studies investigated the influence of comorbidity on HRQOL. No correlations were found by three studies [25,33,46], while results from Bozec et al. [26] were retrieved from the analysis of questionnaires taken 6 months after surgery. As reported in the inclusion/exclusion criteria, these results were not taken into account, since a great variability in HRQOL scores was reported in the literature during the first year after treatment.
It must be noted that different methods (even non-standardized and non-validated) were used to assess patients' comorbidity status. In our opinion, it is strongly preferable to use one of the several scoring systems and scales widely adopted elsewhere in the literature, such as the Kaplan-Feinstein Index (KFI), which was developed to evaluate comorbidities in diabetes mellitus [57] and subsequently modified and validated by Piccirillo [58], or the Adult Comorbidity Evaluation 27 (ACE-27) [59], which also includes alcohol abuse.

Alcohol, Smoke, and Educational Level
The effects of alcohol consumption and smoking were investigated only by Bozec et al. [27], who found no correlations with HRQOL, contrary to other findings reported in the literature [60,61]. To clarify the roles of smoking and alcohol consumption in determining HRQOL, a comparison with control groups composed of teetotalers and non-smokers would be required. Unfortunately, it would be extremely challenging to obtain sample sizes adequate for a reliable comparison, since smoking and alcohol intake are the main risk factors for the development of OCC [62].
The correlations between HRQOL and educational level were analyzed by Bozec et al. [27] and Huang et al. [33], both finding no associations.
Interesting results would be expected from an investigation into educational level in larger cohorts. In this regard, it might be preferable to define standardized subgroups by using validated evaluation tools, such as the International Standard Classification of Education (ISCED) [22].

Cancer Site
Although HRQOL in HNC patients has gained great relevance during recent decades, most published studies still have not considered that cancer site might have a significant impact [34,37,42,46,50,63]. Indeed, the most common reason for exclusion in the screened articles was directly related to this aspect (Figure 2). This potential source of bias is rarely considered with regard to the HNC regions (e.g., oral cavity, oropharynx, larynx, etc.), and even less so with regard to oral subsites.
Interestingly, findings reported by Kovacs et al. [37] demonstrated that cancers arising from different oral subsites affect HRQOL differently, while Pierre et al. [46] and van Gemert et al. [48] found no significant variations. Such controversies should be addressed by analyzing larger samples that allow a more reliable data stratification. At the same time, it must be highlighted that cohort selection could overcome this issue by including more homogeneous cases, as was done in some included articles [28,31,38,39,45].

Cancer Stage
Most likely, the cancer stage represents one of the most challenging variables to correlate with HRQOL, since a multitude of baseline confounders must be considered. For example, compared to early-stage cancers, advanced stages more frequently require adjuvant therapies and more extensive surgeries, which may include a mandibular resection, implying more demanding reconstruction strategies. The appraisal of findings reported by Beck-Broichsitter et al. [24] and Becker et al. [23] provided a clear demonstration of the controversies that could be encountered due to omitted variable biases. The authors compared the same T-stage subgroups (Tis-2 vs. T3/4): one found no significant differences, while the other reported worse results for almost all questionnaire items in advanced-stage cancers. Controversies like this appear repeatedly in the screened papers, some reporting no differences [25,27,36,49], others reporting substantial ones [9,33,34,46].
In our opinion, the cancer stage could be considered in a wider context, including almost all baseline confounders. The only exception is represented by those middle-stage cancers that may or may not be eligible for adjuvant therapies based on clinical and histological features. Future studies should provide adequate evidence to reliably correlate these variables.
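The way an omitted variable can produce contradictory T-stage comparisons can be shown with a toy calculation. The sketch below is purely hypothetical: it assumes an invented HRQOL score that depends only on adjuvant RT and not on stage, and the cohort sizes, scores, and function names are illustrative rather than taken from any included study.

```python
# Hypothetical model: the score depends ONLY on adjuvant RT, never on T stage.
BASE_SCORE = 80.0   # invented mean score without RT
RT_PENALTY = 20.0   # invented score reduction associated with RT


def mean_score(n_rt, n_no_rt):
    """Mean score of a group with n_rt irradiated and n_no_rt non-irradiated cases."""
    total = n_rt * (BASE_SCORE - RT_PENALTY) + n_no_rt * BASE_SCORE
    return total / (n_rt + n_no_rt)


def stage_gap(early, advanced):
    """Early-stage mean minus advanced-stage mean; each group is (n_rt, n_no_rt)."""
    return mean_score(*early) - mean_score(*advanced)


# Cohort A: RT is evenly distributed across stage groups, so no gap appears.
gap_a = stage_gap(early=(10, 10), advanced=(10, 10))

# Cohort B: RT is concentrated in advanced stages, so a gap appears
# even though stage itself has no effect in this model.
gap_b = stage_gap(early=(2, 18), advanced=(18, 2))

# Stratifying cohort B by RT (irradiated cases only) removes the gap again.
gap_b_rt_only = stage_gap(early=(2, 0), advanced=(18, 0))

print(gap_a, gap_b, gap_b_rt_only)  # 0.0 16.0 0.0
```

Two cohorts that differ only in how RT is distributed across stages thus reach opposite conclusions about stage unless RT is stratified or held constant.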

Mandibular Resection
Although it has been previously stressed that mandibular resection strongly impairs patients' HRQOL [16,64], the overall findings of included studies are inconsistent. Becker et al. [23] were the sole researchers reporting worse results in patients undergoing mandibular resection compared to those who did not. The mandibular resection group was also studied by distinguishing marginal from segmental resections. Unsurprisingly, the former demonstrated better questionnaire results.
As with most of the selected variables, the controversies observed among included articles suggest the existence of baseline confounding. We suppose that the need for adjuvant therapies (particularly RT), the reconstruction, the cancer stage, and the extent of surgical resection could be the most probable sources of bias, since mandible involvement is commonly associated with advanced cancer stages.

Extent of Resection
Van Gemert et al. [48] were the sole researchers who stratified the studied sample according to the extent of resection (specifically, its horizontal size). Contrary to what was documented elsewhere [16,65], they reported minimal differences. Since current reconstructive techniques allow surgeons to adequately restore even complex and extended defects, the authors suggest that accurate and successful reconstructions could justify these findings. We agree with this hypothesis, although surgical complications and secondary surgery must be excluded or carefully examined during data analysis to rule out possible omitted variable biases. Thus, influences from the aforementioned baseline confounders (see cancer stage paragraph) should be considered.

Surgical Approach
Within the included studies, the impact of the surgical approach on HRQOL was hypothesized to explain some of the findings. Ferri et al. [31] were the only ones who considered this variable for data analysis. They compared two different treatment protocols: transoral partial pelviglossectomy followed by a buccinator artery myomucosal flap versus a pull-through partial pelviglossectomy followed by various free flaps. Significantly better results were reported in the former group.
The comparison of different surgical protocols implies taking into account some baseline confounding. For example, the pull-through resection involves various deep structures of the mouth floor that can more likely be restored by using free flaps [66], as clearly recognized by the authors. The cancer stage, adjuvant therapies, and the extent of resection also represent possible baseline confounding variables, since the cancer extent might force the surgeon to choose more invasive surgical approaches.
As reported elsewhere in the literature, the surgical approach seems to impact HRQOL in treated patients. Although disease-free survival still represents the primary outcome, minimally invasive approaches should be considered whenever it is possible, in order to reduce post-operative morbidity [67][68][69][70][71].

Neck Dissection
Some contradictory results were retrieved from the included studies concerning ND as an HRQOL determinant. Kovacs et al. [37] described progressively worse results when comparing patients who did not receive ND to those treated by selective ND (levels I-II) and those treated by type III modified radical ND. We agree with the authors' opinion about the possibility of baseline confounding, since patients undergoing ND most frequently also underwent adjuvant RT. Future studies comparing patients receiving RT/CRT only with those treated by surgery with neck dissection and adjuvant RT/CRT will probably clarify these doubts.

Reconstruction
Unsurprisingly, reconstruction was the most investigated among surgery-related variables. It is commonly believed that the quality of reconstruction is strictly associated with patients' functional and aesthetic outcomes and post-treatment HRQOL [72,73]. Reconstructive surgery has made great strides since free flaps were introduced for the restoration of head and neck defects [72][73][74]. In a systematic review of the literature on reconstructive strategies in patients not eligible for free flaps, we found, surprisingly, a growing interest in more conservative solutions over the last few years [75][76][77].
Despite the extensive literature, there is still debate about which reconstructive protocol is best for restoring oral defects [78][79][80][81][82][83]. Similarly, the findings reported by the included studies were widely controversial. In this regard, it should be noted that large differences within the studied populations do not permit a reliable comparison of the observed outcomes. In our opinion, the evaluation of the impact of reconstructive procedures on HRQOL implies several risks of bias that must be considered. For instance, careful attention should be paid to patients who developed surgical complications, by excluding them or by performing an accurate sample stratification. Such complications may heavily impair the functional outcome, requiring a much longer recovery time, long-term rehabilitation programs, or even secondary surgery. Moreover, depending on the chosen procedure, free flaps may lead to various degrees of donor site morbidity [84,85]. All these aspects should be considered for their potential effects on HRQOL.
Reconstruction strategies are mainly chosen according to the defect size and composition: small to moderate simple defects may benefit from the reduced donor site morbidity of local flaps, while large and/or composite defects need free or regional flaps to be restored [75,86]. Therefore, the evaluation of reconstruction as an HRQOL determinant should consider some baseline confounding variables, such as the cancer stage, the extent of resection, the mandibular involvement, and the adjuvant therapies. None of the included studies considered all these independent variables simultaneously during data analysis. Instead, many of them investigated various reconstructive procedures by grouping different flaps together. In our opinion, a reliable comparison should first consider the studied flaps separately to minimize avoidable biases.
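The effect of such baseline confounding can be illustrated with a minimal sketch. The records below are entirely hypothetical (invented flap labels, stage groups, and scores on a 0-100 scale) and are not drawn from any included study; the helper `mean_score` is likewise an illustrative name. The sketch only shows how a crude comparison between flap types may differ from a comparison stratified by cancer stage when flap choice depends on stage.

```python
from statistics import mean

# Hypothetical records: (flap type, cancer stage group, HRQOL score on a 0-100 scale).
# All values are invented for illustration only.
records = [
    ("local", "early", 80), ("local", "early", 78), ("local", "early", 82),
    ("local", "advanced", 60),
    ("free", "early", 81),
    ("free", "advanced", 61), ("free", "advanced", 59), ("free", "advanced", 63),
]

def mean_score(rows, flap, stage=None):
    """Mean score for one flap type, optionally restricted to one stage group."""
    scores = [s for f, g, s in rows if f == flap and (stage is None or g == stage)]
    return mean(scores)

# Crude comparison: ignores the cancer stage entirely.
crude_diff = mean_score(records, "local") - mean_score(records, "free")

# Stratified comparison: local vs. free flaps within each stage group.
stratified_diff = {
    stage: mean_score(records, "local", stage) - mean_score(records, "free", stage)
    for stage in ("early", "advanced")
}

print("crude difference (local - free):", crude_diff)
print("stratified differences:", stratified_diff)
```

In this invented data set, local flaps appear clearly superior in the crude comparison only because they were mostly used in early-stage patients; within each stage group the difference nearly vanishes. A real analysis would need adequate sample sizes per stratum or a multivariable model.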

Radiotherapy and Chemotherapy
Almost all the included studies agreed about the deteriorating effects of radiotherapy on HRQOL. Kovacs et al. [37] performed an accurate study comparing patients who received adjuvant RT, adjuvant CT, or adjuvant CRT. Interestingly, there were no significant differences between the adjuvant RT and adjuvant CRT groups, both of which showed significantly worse results compared to patients who did not undergo post-surgical therapies.
Some symptom-related items were found to be particularly affected: dry mouth, sticky saliva, and mouth opening were almost always impaired. These findings were in line with those already widely reported in the published literature [87][88][89][90][91][92][93].
Less interest was shown in studying the effects of neoadjuvant therapies and adjuvant CT alone on HRQOL. This could be attributed to the uncommon use of these treatment protocols in HNC. It would be interesting to investigate whether neoadjuvant and post-surgical therapies influence HRQOL differently.
Further compelling aspects derive from the adopted RT technique. The accurate analysis performed by Huang et al. [33] underlined that the more recent 3D radiotherapy (3DRT) and intensity-modulated radiotherapy (IMRT) have a more favorable impact on patients' HRQOL, as widely accepted in the current literature [94,95]. Nevertheless, most included studies did not specify which techniques were used in the studied samples, producing a relevant source of bias.
As mentioned above, adjuvant therapies suffer from several baseline confounding factors that should always be considered during data appraisal. Nonetheless, the trends in the reported findings clearly suggest that they can be considered among the main factors influencing HRQOL.

Synchronous Lesions, Recurrences, and Metachronous Lesions
Although rare, the presence of synchronous lesions in the oral cavity inevitably requires larger resective surgeries that negatively influence the HRQOL, but only three studies clearly excluded these patients [28,31,49].
On the other hand, it might appear obvious that a recurrence of previously treated tumors or the development of further cancers may strongly impair HRQOL, especially by affecting psychological status and symptoms [96][97][98]. Nevertheless, only 12 of the included papers considered this aspect during cohort selection [24,26,31,33,34,35,36,39,42,46,48,49]. Mair et al. [39] were the only ones who conducted an analysis to compare disease-free patients to those who developed a recurrence. Their results strongly support the initial hypothesis, but the potential sources of bias that might be encountered should be noted.
Indeed, progression-free survival strongly depends on the cancer stage, which also reflects the invasiveness of the adopted treatment.

Major Surgical Complications and Secondary Surgery
Only a minority of the included studies considered these variables. Girod et al. [32] investigated the differences between the reconstruction of OC defects by using split thickness skin graft and acellular dermal matrix. They stratified the results by surgical complications, distinguishing patients who experienced a graft failure from those with regular healing. No significant differences were found, but the small sample size and the missing stratification by other variables might have affected their results.
It is reasonable to believe that post-surgical complications and recurrences may impact HRQOL. In our opinion, this might be related to the resulting functional and aesthetic impairments or to the need for secondary surgery, which may impair psychological status and symptoms [96][97][98]. The included studies did not investigate this relationship, which could provide interesting food for thought for future studies.

Other Variables
The increasing knowledge in multidisciplinary management of oncological patients strongly highlights the relevance of the psychological status [99]. The HRQOL is considered useful not only to evaluate the quality of care interventions from the patient's perspective but also to adjust clinical decision making by evaluating patients' needs and additional interventions, such as psychological counselling [100]. The close relationship between psychological status and HRQOL was demonstrated to predict the quality of life in patients treated for HNC [101].
Expressions of poor psychological status were investigated among the included studies. Moubayed et al. [40] and Bozec et al. [27] observed a negative correlation between depression and HRQOL, as measured by using the Hospital Anxiety and Depression Scale (HADS), while Airoldi et al. [22] supported this observation after evaluating associations with the Dische Scale. In our opinion, obtaining information on patients' psychological status is mandatory to avoid biases that could impair the reported observations. Stratifying results by using validated and standardized indexing systems could address this issue.
Dental restoration represents one of the most interesting fields in the search for treatment-related aspects that could improve HRQOL in OC patients. Dental status is usually already impaired at baseline, and not only in patients with cancers involving the jaws. Dental prosthetic restoration (supported or not by implants) could deeply influence patients' everyday life and HRQOL. The recovery of dental occlusion and balanced mastication has been demonstrated to influence aesthetic outcomes, social parameters, swallowing, and cognitive functions [102][103][104][105][106][107][108][109][110][111][112]. The published literature reported significantly better results in patients undergoing microvascular mandibular reconstruction (mostly by using a free fibula flap) followed by implant-supported dental prosthetic rehabilitation compared to non-rehabilitated patients [108][109][110][111][112]. Unfortunately, most of these articles included non-oncological patients within the investigated population and did not meet the inclusion/exclusion criteria.
A potential limitation of this review could be the exclusion of papers that used other evaluation tools. We chose to select only studies based on EORTC questionnaires because of the comprehensive insight given by the assessment of general (by the QLQ-C30 module) and specific (by the QLQ-H&N35/43 modules) features, and because of the widespread use of these questionnaires. Further studies could provide a comparison with other tools.

Conclusions and Recommendations for Future Studies
The number of controversies found in the current literature demonstrates a substantial lack of evidence regarding HRQOL determinants in HNC patients. Therefore, none of the potential influencing variables should be excluded from data analysis based on the authors' opinion only.
Currently, many of the published articles consider only a minority of potential determinants, and data analysis is commonly performed on each independent variable individually. By approaching such a complex and multidimensional aspect as HRQOL in this way, the reliability of the reported findings might be strongly weakened by several selection and omitted variable biases. Although the EORTC Quality of Life Group was founded in 1980, a standardized guideline for cohort selection is still lacking. Thus, the crucial task of avoiding the described biases rests solely on the examiners' knowledge.
We strongly believe that almost all the identified determinants should be investigated. This implies that much larger samples and much more data must be collected. At the same time, particular attention should be paid to cohort selection to achieve better comparability among the studies. This goal will probably be attained by creating a shared and standardized online data set.
Considering the complex network of baseline confounding highlighted in this manuscript, a suitable strategy could be the use of further evaluation tools, scales, and indexes that condense many variables into a single score. In our opinion, the benefits of this approach are twofold: a simplification of data analysis and a minimization of omitted variable biases. In this regard, an interesting investigation was performed by Tribius et al. [113] regarding the influence of sociodemographic variables on HRQOL in HNC patients. This study used an adapted version of a composite social class indicator [114] that considered three different sociodemographic variables (educational level, type of occupation, and household income) to classify socio-economic status as high, moderate, or low. Other examples were reported within the discussion of this review (G8, ACE-27, KFI, HADS), but those were related to sociodemographic and psychological variables. To the best of our knowledge, no scoring systems that condense the selected DT-specific variables have been developed yet. Our recommendation for future research is to consider these features simultaneously, rather than individually, addressing the baseline confounding described above, and to select cohorts that are as homogeneous as possible. An example of this protocol is given by Ferri et al. [31] and Canis et al. [28], who performed accurate cohort selections resulting in quite small sample sizes, but ones that were highly homogeneous and reproducible.
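The general idea of condensing several variables into a single class can be sketched as follows. This is only an illustration: the three-level coding of each component, the summation, and the cut-offs are assumptions made here for the example and do not reproduce the scoring rules of the indicator used in [113,114]; the function name `composite_class` is hypothetical.

```python
def composite_class(education: int, occupation: int, income: int) -> str:
    """Condense three sociodemographic components into one class.

    Each component is assumed to be pre-coded as 1 (low), 2 (moderate), or 3 (high).
    The coding, the simple sum, and the cut-offs below are hypothetical and are
    NOT the scoring rules of any published indicator.
    """
    components = {"education": education, "occupation": occupation, "income": income}
    for name, value in components.items():
        if value not in (1, 2, 3):
            raise ValueError(f"{name} must be coded 1, 2, or 3, got {value!r}")

    total = education + occupation + income  # ranges from 3 to 9
    if total >= 8:
        return "high"
    if total >= 5:
        return "moderate"
    return "low"


# Usage with invented patients: one stratification variable replaces three.
print(composite_class(3, 3, 2))  # sum of 8
print(composite_class(2, 1, 2))  # sum of 5
print(composite_class(1, 1, 2))  # sum of 4
```

With such a score, results can be stratified by a single categorical variable instead of three separate ones, which reduces the number of strata and makes the analysis easier to report.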
As observed by Borggreven et al. [25], patients usually present compromised HRQOL at baseline, probably due to preexisting impairments related to comorbidity status or cancer diagnosis. We believe that this issue could be addressed by evaluating only the differences between baseline and post-treatment questionnaires in a longitudinal study design, rather than absolute scores compared to a reference population in a cross-sectional fashion, even though the inter-questionnaire analysis may provide interesting insights [44].
As a result of this approach, more homogeneous, reproducible, and comparable cohorts can be expected, enhancing the level of evidence in the field.
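The longitudinal comparison recommended above can be sketched as a per-patient change score. The patient identifiers and scores below are invented, the function name `change_scores` is hypothetical, and the sketch assumes that scale scores have already been computed on the 0-100 range used by the EORTC questionnaires.

```python
def change_scores(baseline: dict, follow_up: dict) -> dict:
    """Return follow-up minus baseline for patients who completed both questionnaires."""
    return {
        patient: follow_up[patient] - baseline[patient]
        for patient in baseline
        if patient in follow_up
    }


# Invented scores for one functional scale (0-100).
baseline = {"P1": 50, "P2": 75, "P3": 67}
follow_up = {"P1": 58, "P2": 67, "P4": 83}

changes = change_scores(baseline, follow_up)
print(changes)  # only P1 and P2 have both time points
```

The sign of a change must be read according to the scale: for functional scales and global health status a positive change indicates improvement, whereas for symptom scales a positive change indicates worsening. Patients with a single questionnaire (P3 and P4 here) are dropped in this sketch; how to handle such missing data is a separate methodological choice.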

Data Availability Statement:
The data presented in this study are available on request from the corresponding author.