Reference Values of the QOLIBRI from General Population Samples in the United Kingdom and The Netherlands

The Quality of Life after Traumatic Brain Injury (QOLIBRI) instrument is an internationally validated patient-reported outcome measure for assessing disease-specific health-related quality of life (HRQoL) in individuals after traumatic brain injury (TBI). However, no reference values for general populations are available yet for use in clinical practice and research in the field of TBI. The aim of the present study was, therefore, to establish these reference values for the United Kingdom (UK) and the Netherlands (NL). For this purpose, an online survey with a reworded version of the QOLIBRI for general populations was used to collect data on 4403 individuals in the UK and 3399 in the NL. This QOLIBRI version was validated by inspecting descriptive statistics, psychometric criteria, and comparability of the translations to the original version. In particular, measurement invariance (MI) was tested to examine whether the items of the instrument were understood in the same way by different individuals in the general population samples and in the TBI sample across the two countries, which is necessary in order to establish reference values. In the general population samples, the reworded QOLIBRI displayed good psychometric properties, including MI across countries and in the non-TBI and TBI samples. Therefore, differences in the QOLIBRI scores can be attributed to real differences in HRQoL. Individuals with and without a chronic health condition did differ significantly, with the latter reporting lower HRQoL. In conclusion, we provided reference values for healthy individuals and individuals with at least one chronic condition from general population samples in the UK and the NL. These can be used in the interpretation of disease-specific HRQoL assessments after TBI applying the QOLIBRI on the individual level in clinical as well as research contexts.


Introduction
Traumatic brain injury (TBI) is often a source of long-lasting impairments and functional limitations [1]. It can affect participation in daily activities [2] and may lead to a stagnation in working life for several years [3] or permanently prevent a return to work [4]. TBI can have dramatic consequences for cognitive, behavioral, and emotional life domains, and increases the risk of experiencing other health-related problems such as increased alcohol consumption and depression [5]. However, a person's perception of TBI sequelae, compared to an objectively assessed functional state, is a subjective dimension, and the relationship between these two types of measurement is not always straightforward [6]. Subjective assessments of health deficits and self-rated health-related quality of life (HRQoL) provide valuable additional information to clinical health examinations and ratings. Thus, patient-reported outcomes (PROs) have now become widely used in assessing HRQoL in the field of TBI. HRQoL measures provide aggregated information on diverse health components, such as physical, psychological (mental and emotional), social and daily life aspects, and are, therefore, able to capture the multidimensionality of individually experienced consequences of TBI [7].
A systematic review of assessments of HRQoL after TBI, covering the period from 1991 to 2013, found that the most frequently used instruments were the generic Short Form (36) Health Survey (SF-36) [8] and the TBI-specific Quality of Life after Traumatic Brain Injury (QOLIBRI) [1]. Both instruments display satisfactory to very good psychometric properties in TBI populations, with the QOLIBRI having higher discriminative powers when separate domains of the QOLIBRI and SF-36 are compared [7,9].
To gain a more in-depth understanding of TBI-specific consequences, one may apply a TBIspecific HRQoL instrument. However, from the perspective of rehabilitation after TBI, applying generic instruments may offer an advantage due to the availability of population-based reference values. Bearing in mind the unspecific nature of some post-TBI symptoms, such as headaches and nausea [10], a comparison with general population samples is essential in order to evaluate the rehabilitation progress. Additionally, population-based reference values play a key role in differentiating between individuals after TBI with and without impaired HRQoL.
In previous research, the QOLIBRI was developed and validated exclusively in samples of individuals after TBI to establish its sensitivity for the TBI condition [11]. In the interest of enhancing the interpretability of its scores in clinical practice and research after TBI, we collected QOLIBRI scores from general population samples in the UK and the NL to provide respective reference values.
Thus, the aims of the present study are: • To ensure the comparability of QOLIBRI translations between general and TBI samples by determining the measurement invariance (MI) in general population samples (healthy individuals and individuals with a chronic health condition) and TBI samples from the UK and the NL. To provide reference values for healthy individuals and individuals with at least one chronic health condition from the UK and the NL.
Only when MI has been verified, reference values will be provided for healthy individuals (and individuals with a chronic health condition) from Dutch and UK general population samples.
Separate reference values will be given for the presence and absence of chronic health conditions, age, sex, and level of education.

Study Design
The present study is a web-based, self-reported, cross-sectional study based on quota sampling of general population samples from the UK and the NL (see below). Additional data of patients after TBI, needed for the MI analyses, were retrieved from the multicenter, prospective, longitudinal, observational Collaborative European Neuro Trauma Effectiveness Research in Traumatic Brain Injury (CENTER-TBI) study [12]. These data were collected at three months post-TBI.

General Population Samples Data Collection
The general population sample data were collected through a web-based survey. Respondents were recruited by a market research agency (https://www.dynata.com/), which distributed the questionnaires and collected the data. The samples were based on existing large internet panels designed to be representative for individuals from the general population from the UK and the NL with regard to age, sex, and education. Data collection was carried out between 29 June and 31 July 2017.
The recruitment integrated several sources, e.g., proprietary loyalty partnerships (members of loyalty programs across travel, entertainment, retail, and other sectors), open recruitment to traditional online panels (e.g., via online banners, online all panels, cable TV advertising, mailings, social media influencers, and other methods), and integrated partnerships with online communities, publishers, and social networks. A broad variety of sources was chosen to reach participants from different social milieus to thereby increase the representativity of the sample.
To avoid a self-selection bias, no specific project details were included in the invitation: participants were invited to "take a survey". Details were disclosed later, after the system had selected the individuals for participation according to the given selection criteria. After completing the survey, participants received an incentive in the form of cash, points, prizes, or sweepstakes from the market research company. Respondents, who were identified by the agency as "speeders" (e.g., who took the survey in less than five minutes), were deleted. The electronic data capture system did not allow missing answers, thus respondents had to answer every question. The recruitment process continued until the required quotas were reached.

Informed Consent
Informed consent for the present survey was obtained by the agency from all those agreeing to complete the online survey. The process is described in the privacy agreement available at https://www.dynata.com/privacy-policy/. Participants were informed on the welcome page of the survey that its aim was a better understanding of the consequences of TBI on patients' lives, that it would take approximately 20 min to complete, and that all responses were confidential and anonymous. Data were anonymized and each participant was assigned a number in the order of questionnaire completion.

Sample Composition
From a total of 11,759 survey participants, 4646 individuals from the UK and 3564 from the NL were included for further analyses. Recruitment was carried out until the required quotas for age, gender, and education had been achieved, which ensured that samples were as comparable as possible to the general populations of the two countries. Nonresponse rates were below 20% (UK: 14.4%, NL: 19.5%). A more detailed analysis of these individuals was not possible due to the recruitment system used.
Prior to the analyses, responses to QOLIBRI items were examined for obvious contradictory response patterns in both general population samples, for example, the choice of the response option "not at all" for all items, meaning that responders were not at all satisfied and at the same time not at all bothered. This indicated that the person had chosen only left-hand side response options, ignoring the item polarity. Due to contradictory response patterns, the data of 243 individuals from the UK and of 165 individuals from the NL general population samples were excluded from further analyses. The individuals included and excluded were compared using chi-square (χ 2 -) tests with Yates correction for nominal variables and independent t-test for continuous variables. In both countries, excluded individuals were predominantly male and younger compared with the total sample (M = 35, SD = 12) and had a middle level of education. In the end, 7802 individuals from the general population (UK: 4403; NL: 3399) were included in the final analyses (see Figure 1).

Data Collection
Individuals after TBI were investigated in the (CENTER-TBI) study [13]. They were recruited between 9 December 2014 and 17 December 2017. The inclusion criteria were a clinical diagnosis of TBI, presentation to hospital within 24 h after the injury, a clinical indication for a computed tomography (CT) scan, and provision of informed consent adhering to local and national requirements. Data were collected applying an electronic case report form (e-CRF, QuesGen Systems Incorporated, Burlingame, CA, USA) either during the hospital visit, in a face-to-face visit, a telephone interview, or by mail combined with a telephone interview. The data were exported from the CENTER-TBI database, Neurobot version 2.0, on 8 November 2018. Further study details can be found elsewhere [12].

Informed Consent
Informed consent was obtained according to local and national requirements for all patients recruited in the Core Dataset of CENTER-TBI and documented in the e-CRF [13].

Sample Composition
Out of the total of 4509 CENTER-TBI core study participants, 554 individuals after TBI from the UK and 936 from the NL participated in the assessments at three months post-TBI and were included in the present study. When there were less than 30% of missing answers per QOLIBRI subscale, scores were calculated by using the prorating method [14]. Of the 1490, 830 individuals did not complete the QOLIBRI at three months.
Chi-square tests with Yates correction for nominal variables and independent t-test for continuous variables showed that participants from the NL had a higher level of education, were mostly female, working or studying, and had predominantly sustained a mild TBI (84% in the NL and 72% in the UK) with a good recovery rated by the Glasgow Coma Scale Extended (GOSE) [15], compared to those who did not complete the QOLIBRI. Analyses of contradictory response patterns did not reveal any peculiarities. No exclusion based on QOLIBRI response patterns was necessary for the TBI sample. A total of 660 individuals (UK: 228, NL: 432) were, therefore, included in the further analyses. For more details on TBI sample attrition, see Figure 2.

General Population Sample
The study on the general population sample was part of the CENTER-TBI study and ethical approval was obtained from the Leids Universitair Centrum-Commissie Medische Ethiek (approval P14.222/NV/nv).

TBI Sample
The CENTER-TBI study (EC grant 602150) was conducted in accordance with all relevant laws of the European Union, which were directly applicable or had a direct effect, and all relevant laws of the countries in which the recruiting sites were located, including but not limited to, the relevant privacy and data protection laws and regulations (the "Privacy Law"), the relevant laws and regulations on the use of human materials, and all relevant guidance relating to clinical studies including, but not limited to, the ICH Harmonised Tripartite Guideline for Good Clinical Practice (CPMP/ICH/135/95, "ICH GCP") and the World Medical Association Declaration of Helsinki entitled "Ethical Principles for Medical Research Involving Human Subjects". Ethical approval was obtained for each recruiting site. The list of sites, ethical committees, approval numbers, and approval dates can be found on the project's website https://www.center-tbi.eu/project/ethical-approval.

Sociodemographic and Health Status Data of the All Samples
All study participants provided information regarding their age, sex, and level of education. Individuals from the general population samples were asked if they had one or more chronic health conditions (asthma, heart disease, stroke, diabetes, back complaints, arthrosis, rheumatism, cancer, memory problems due to a neurological condition like dementia, memory problems due to aging, depression, or other problems). Multiple answers were allowed.
The severity of TBI was rated by attending clinical personnel using the Glasgow Coma Scale (GCS), with values of 3-8 indicating severe, 9-12 moderate, and 13-15 mild TBI [16]. Recovery after TBI was rated using the Glasgow Outcome Scale Extended (GOSE) with scores of 3-4 indicating severe, 5-6 moderate disability, and 7-8 good recovery. Scores of 2 indicate a vegetative state and a score of 1 death [15].

Disease-Specific Health-Related Quality of Life after Brain Injury (QOLIBRI)
HRQoL was assessed administering the TBI-specific QOLIBRI questionnaire, which was developed and validated in accordance with the World Health Organization definition of health [14,17]. It covers six life domains (Cognition, Self, Autonomy and Daily life, Social Relationships, Emotions and Physical Problems). Items contributing to the domains Emotions and Physical problems are negatively worded ("How bothered are you by…?"), the remaining items positively ("How satisfied are you with…?"). Thirty-seven items are rated on a 5-point Likert scale ("Not at all" = 1, "Slightly" = 2, "Moderately" = 3, "Quite" = 4, "Very" = 5) and reverse coding was performed for negatively worded items. The QOLIBRI total score is scaled to vary between 0 (worst possible HRQoL) and 100 (best possible HRQoL) [14].
As not all items were directly applicable to the general population, three items were reworded to remove any reference to a TBI: "How satisfied are you with what you have achieved recently (instead of "since your brain injury")?", "How bothered are you by the effects of any injuries you sustained? (instead of "any other injuries you sustained at the same time as your brain injury")", and "Overall, how bothered are you by the effects of any health problems? (instead of "brain injury")".

Statistical Analyses
The statistical analyses comprised the following steps: (1) examination of the psychometric properties of the QOLIBRI on the item and scale level in the general population; (2) MI analyses between groups of individuals from the TBI and general population samples and between the countries, to ensure that the same concept of HRQoL was being measured; (3) multivariate linear regression analyses, which examined whether country of residence, age, sex, level of education, and the presence of chronic health conditions affected the HRQoL/QOLIBRI total score; (4) based on the regression results, computation of reference values for individuals with and without chronic health conditions for the QOLIBRI total score and subscales with respect to age, sex, and level of education.
Descriptive statistics (mean, standard deviation, response frequencies) were used to describe participants' sociodemographic and health-related data.

Item Characteristics of the QOLIBRI in the General Populations
As the main focus of this study was to provide reference values for the QOLIBRI from general population samples, item properties such as mean, standard deviation, skewness, and ceiling effects are only reported for the general population samples. Items with absolute skewness values between 1.0 and 1.3 were interpreted as moderately skewed and not affecting further analysis [10,18]. Due to the high variation in cut-off values for ceiling effects (15-60%) in the current literature [10,19], we set the cut-off value at 40% (twice as high as by chance, 1/5 = 20%) for the maximum response category "very". Additionally, we checked if there were items with less than 10% of responses in the two lower response categories "not at all" and "slightly".

Scale Characteristics of the QOLIBRI in the General Populations
The scales' internal consistency was determined using Cronbach's alpha, with values between 0.7 and 0.95 indicating good to excellent internal consistency [19]. An item was defined as inconsistent when the corrected item-total correlation coefficient (CITC) exceeded 0.4 [20]. Correlations between the QOLIBRI domains were investigated using Pearson correlation coefficients, with values ranging from 0.36 to 0.67, indicating a moderate linear association [21].

Construct Validity of the QOLIBRI in the General Populations
As a prerequisite for MI testing, construct validity was investigated in the general population samples to ensure the comparability of the reworded and the original QOLIBRI using confirmatory factor analysis (CFA) with the robust weighted least squares estimator (WLSMV, calculated with the lavaan-package in R [22]). Model fit was assessed by means of the scaled chi-square statistics, Comparative Fit Index (CFI), and root mean square error of approximation (RMSEA) with a 90percent confidence interval. As the standard cut-offs for CFI (>0.95) and RMSEA (<0.06) [23,24], indicating good model fit, have not been validated for the WLSMV estimator, and they should be interpreted with caution [25]. To address this issue, we compared fit indices across models with different factorial structures (one common factor, two correlated factors-one containing all positively worded "satisfaction" items, and the other one all negatively worded "bothered" items, and six correlated factors) with higher CFI values and lower RMSEA values indicating a better model.

Measurement Invariance in All Samples
By using modern statistical techniques, such as MI testing, it is possible to verify whether the questionnaire score differences between individuals, e.g., with and without TBI experience, can be attributed to true differences in HRQoL or rather to differences in interpretation of the items and response categories, as well as differences in items difficulty and their importance [26].
Therefore, MI testing in the framework of CFA was applied to examine whether TBI experience and cultural/language differences influenced the comprehension of the QOLIBRI items. First, we examined the influence of the TBI experience on the invariance of model parameters by comparing groups of individuals from the TBI and general population samples separately for each country. To overcome estimation problems due to the large number of estimated parameters and relatively small sizes of the two TBI samples, the QOLIBRI items were dichotomized. The response categories "not at all", "slightly" and "moderately" were coded as 0, and "quite" and "very" as 1. We then investigated the effect of the country by comparing UK and NL general population samples.
The strategy for analyzing ordinary scaled response categories suggested by Wu and Estabrook (2016) was applied, resulting in three steps: testing of the (1) configural, (2) partial, and (3) full invariance model. For more details, see Wu and Easterbrook [27].
For MI analyses, at least N = 200 observations per group are necessary to obtain reliable results [28]. All estimations for invariance testing (WLSMV-estimator, theta-parameterization) were performed within the lavaan-package (version 0.6-3) [22]. For model comparisons, we applied a scaled chi-square difference test with the significance level set to α = 0.05. As this test has been criticized for being very powerful in detecting small, possibly irrelevant effects in large samples [29], in case of invariance violation, we estimated whether the effect had a practical significance for estimating the probability of choosing a particular response category. For example, if the full invariance model (invariant thresholds) had a significantly worse fit than the partial invariance model (noninvariant thresholds), the probabilities of individuals from general population samples choosing a particular response category were estimated in both models, and then compared. If the differences did not exceed 5%, we considered the thresholds to be invariant [30].

Reference Values from General Population-Based Samples
As clinicians may be interested in the subjective health status and HRQoL of a single patient after TBI, population-based reference values were calculated as percentiles. Percentiles indicate the value below which a given percentage of observations falls. Based on this information, one can determine whether the QOLIBRI score of an individual after TBI is below, equal to, or above the value of the reference population. The following percentiles are provided for a patient-level interpretation: 2.5%, 5%, 16%, 30%, 40%, 50%, 60%, 70%, 85%, 95%, and 97.25%. HRQoL is considered to be impaired when scores are one standard deviation below the average of the general population sample [31], which corresponds to the 16%-quantile when the data are assumed to be normally distributed. Examples are given in the results section.
Previous research has shown that 50 to 75 cases for each subgroup can already be sufficient to provide norm values [32]. However, as several factors can influence the required sample size (e.g., which type of norms are provided [33]), we have decided to report reference values when the number of cases was at least N = 100. All analyses were performed in R 3.6.0 [34].

General Population Sample
Study participants (N = 4403 from the UK and N = 3399 from the NL) from the general population samples were analyzed. Individuals without a chronic health condition (UK: 2016; NL: 1572) were differentiated from individuals with chronic health conditions (UK: 2387; NL: 1827; for details, see Table  1). In both countries, up to 55% of individuals from the general population samples indicated that they had at least one chronic health condition, and, in comparison with the TBI samples, significantly more individuals described themselves as being unable to work (UK: 10%, NL: 12.8%).

TBI Sample
The TBI sample contained 660 individuals (N = 228 from the UK and N = 432 from the NL), who had filled in the QOLIBRI at three months post-TBI. The majority of individuals from both TBI samples had experienced a mild TBI (71.9% and 84.1 % in the UK and NL, respectively). In the UK, almost half of all individuals after TBI made a good recovery 48.7% (NL: 66.2%) and 20% were still severely disabled (NL: 8.8%) at three months post-TBI. Sociodemographic and health-related data for all samples are presented in Table 1. Housekeeper: general population sample: looking after others, e.g., kids or parents; Education level: TBI-sample: "low": currently in school and primary school, "middle": currently in diploma and secondary school/high school and post-high school, "high": college/university.

Comparison of the General Population Samples with TBI Samples
In both countries, significant differences were identified between the general population samples and the TBI samples concerning age, sex, educational level, and work status. In both general population samples, individuals were younger than in the TBI samples (with an average age difference of five years in the UK and of 10 years in the NL) and had a lower male incidence (UK: 48.5% vs. 66.7%, NL: 49.0% vs. 58.6%). The rate of individuals with a high level of education (diploma, secondary/high school, or post-high school) was lower in the general population samples compared with the TBI samples (UK: 34.5% vs. 43%; NL: 25% vs. 31.7). In the UK, the number of working individuals was lower in the general population sample compared with the TBI sample (51.5% vs. 63.6%, respectively), whereas in the NL the general population sample contained more employed individuals compared with the TBI sample (52.3% vs. 46.8%, respectively).

Item Characteristics of the QOLIBRI in the General Population Samples
On a descriptive level, there were some differences between countries concerning the item characteristics: individuals from the UK general population sample scored lower on average but with higher dispersion and mean values varying from 3.0 (satisfaction with sex life) to 4.1 (satisfaction with the ability to find a way around; NL: from 3.5 to 4.2); items were less skewed ((−1; −0.2), NL: (−1.3; −0.3)), and a ceiling effect was observed for only six items, compared with 10 in the NL sample. All items in the UK sample and 22 items in the NL sample had over 10% responses in two adjusted response categories "not at all" and "slightly". For more detailed information, see Appendix A Table A1.

Scale Characteristics of the General Population Samples
The total scale Cronbach's alpha was high in both general population samples (UK: 0.94, NL: 0.96), and per-scale alpha coefficients ranged between 0.86 (Emotions) and 0.95 (Cognition) in the UK general population sample, and between 0.86 and 0.92 in the NL, indicating a very good internal consistency of the scales. Also based on CITC, all items were found to be consistent in both samples. QOLIBRI domains were moderately to highly correlated (UK: 0.39-0.77, NL: 0.46-0.76). For more detailed information, see Appendix A Table A2.

Measurement Invariance
When the general population and TBI samples were compared for each country, the fit of the model with six correlated factors was not negatively affected by constraining equal intercepts, loadings, and residuals across all groups (UK: Δχ 2 (Δdf) = 23.00 (25), p = 0.577, NL: Δ χ 2 (Δdf) = 8.27 (25), p = 0.999). However, assuming equality of thresholds resulted in significantly higher chi-square values, indicating that some thresholds may not be invariant across groups. When UK and NL general population samples were compared, significant, yet very small, and thus, negligible chisquare differences (Δχ 2 (Δdf) = 87.27 (25), p < 0.001) were observed [32]. The model fit deteriorated meaningfully when the thresholds were restricted to be equivalent across groups (Δχ 2 (Δdf) = 2395.26 (148), p < 0.001). For details, see Appendix A Table A4.
More detailed analyses on the estimated thresholds using the partial invariance model showed that the thresholds obtained in the general population sample were significantly higher than those in the TBI sample. Comparing the UK and NL general population samples, the thresholds obtained from the UK sample were lower in all cases (see Appendix A, Figure A1). However, for individuals from the general population samples, differences in the probabilities of choosing a particular response category did not exceed 5% (Appendix A, Table A5). Therefore, the violation of the threshold invariance may be interpreted as not significant. This implies that the QOLBRI scores can be compared between countries, and between the general population and TBI samples. More important, differences in the QOLIBRI scores should be attributed to "real" differences in HRQoL.

Reference Values for the General Population Samples
A significant difference in HRQoL as indicated by the QOLIBRI total score, was found between the countries. The NL sample experienced a significantly higher HRQoL compared with the UK general population sample (β = 8.76, p < 0.001). Regression analyses identified a significant effect of age, level of education, presence of at least one chronic health condition, and interactions between age and sex and health status in both general population samples. No significant effects for sex were found in both general population samples (Table 2). Reference values of the general population-based samples for the QOLIBRI total score are presented in Table 3 for the UK and Table 4 for the NL. The tables with the reference values for the QOLIBRI subscales can be found in the Online Supplementary Materials (Table S1: UK; Table S2: NL).  The following example illustrates how to apply these norms. After a TBI, a 70-year-old woman from the UK without any chronic health condition reports a QOLIBRI total score of 75. The table depicts that around 20% of healthy individuals in her age group reported the same level of HRQoL or a lower HRQoL. In other words, 80% of the reference population experience better HRQoL. Should a chronic health condition be known, 60% of the reference population from her age and health status group report better HRQoL and 40% of the general population with similar conditions experience a better HRQoL than she does.
Based on the 16%-percentile cut-off value, HRQoL is interpreted as impaired for female healthy individuals in the age range of 64-75 years when the QOLIBRI total score is under 69, or under 50 if any chronic health condition is reported. The score of 75 exceeds both cut-off values and can, therefore, be interpreted as indicating that she is not impaired (compared with individuals from the UK general population aged between 65-75 years with and without any chronic health condition).

Discussion
The aim of our study was to enhance the interpretability of disease-specific HRQoL after TBI using the QOLIBRI by establishing reference values from general population samples in the UK and the NL, based on representative quotas with regard to sex, age, and educational level. The representation of these characteristics corresponds to their distribution in the UK and the NL general populations (e.g., see the Organisation for Economic Co-operation and Development (OECD) [35] for sociodemographic characteristics in European countries). In this respect, the data from our general population samples are comparable to the general population of each country. This study is unique, as such general population-based reference values are currently not available for the QOLIBRI.
The results indicated that the reworded QOLIBRI is applicable to general population samples and displays good psychometric properties. Measurement invariance testing demonstrated that for the six HRQoL subdomains, all QOLIBRI items have the same meaning for individuals with and without a TBI experience and in the different countries. Therefore, we conclude that the QOLIBRI scores can be compared across general population samples and TBI samples in the UK and the NL. The differences in the scores have to be explained by "real" differences in HRQoL and not by other factors, such as differences in the understanding of items or response categories. Thus, we were able to establish population-based reference values.
In previous research, individuals from the NL general population reported higher mental summary component scores in the SF-36 in comparison to seven other countries, including five European countries [36]. Lower HRQoL was associated with the presence of chronic health conditions [37]. Our results replicate these findings, with individuals from the NL general population sample reporting significantly higher HRQoL compared to those from the UK. Previous findings concerning the association of HRQoL with age are ambiguous: in the general populations of Norway and Canada, higher age was positively associated with the mental summary component score of the SF-36 and negatively with the physical summary component [38,39]. Our data showed that older individuals from the general population samples from both countries and subsamples with and without chronic health conditions report better HRQoL. Our study did not identify any sex differences in the two countries, with the exception of the subgroups with and without any chronic health conditions in the NL sample. This finding is also comparable to a study assessing the generic HRQoL by means of the SF-36: here only the general health perception scale was sensitive to sex differences, with females reporting lower generic HRQoL [8,40].
Previously, the interpretation of the QOLIBRI total score was facilitated through a cross-walk analysis with the mental component summary score of the SF-36, for which US population-based norms were used [41]. HRQoL was considered to be impaired when scores were one standard deviation below the average of the general population sample [35]. Therefore, QOLIBRI values under 60 indicated impaired disease-specific HRQoL [41]. Our reference values provide a country-adapted basis because they were obtained from general population samples. Here, cut-off values of 56 for the UK and 65 for the NL should be taken to identify impaired TBI-specific HRQoL when comparing individuals after TBI with healthy individuals. As we found significant differences between the countries, we strongly recommend using the respective population-based reference values presented in the current study when the QOLIBRI is applied.

Strengths and Limitations
A strength of the present study is the large size of the general population samples, which allowed for high-powered statistical analyses. The stratification into healthy individuals and those having reported at least one chronic health complaint offers an additional possibility for the interpretation of HRQoL of individuals after TBI.
The representativity of the recruited samples may be questioned. First, the selection of participants was based on different web-based panels. This might have led to different selection biases, even when several platforms were used for recruiting in order to increase the representativity of different groups. Second, no information was available from the survey agency concerning participants who were contacted but did not take part in the survey. In other words, it was not possible to determine how many and which individuals could have potentially participated in the study, as a means of demonstrating a selection bias. Third, answers given in the online survey could be associated with (self-)selection and nonresponse bias [42,43], as some individuals may systematically participate in online surveys.
Yet, the sampling procedure was strictly based on demographic characteristics such as age, sex, and education, and on a very large panel involving individuals from different sources. The quota sampling with respect to age, sex, and level of education corresponded to the distribution in the general populations of the two countries (see OECD statistics [35]). Therefore, the samples seem valid for providing reference values to evaluate the degree of impairment of HRQoL in individuals after TBI.
Another limitation was the lack of precise information concerning previously experienced TBIs in the general population samples. However, the estimated TBI prevalence based on reported ageadjusted hospital discharge rates due to TBI is quite low and reaches 312.7 per 100,000 in the UK and 173.7 per 100,000 in the NL [44]. Thus, the presence of individuals, who experienced TBI in the general population samples, was very unlikely to cause a bias concerning the reference values and evaluation of HRQoL.
The baseline characteristics of the general and the TBI sample displayed differences with regard to sex, age, and education status. However, such differences are unavoidable bearing in mind the two times higher prevalence of TBI among males [45] and increasing TBI incidence in elderly people [46], resulting in differences in work status distribution, and in higher rates of retired individuals in the TBI samples.
Furthermore, the relatively small sizes of the TBI samples required dichotomization of the QOLIBRI response categories for MI testing, which is associated with a loss of information concerning response patterns. However, the TBI sample was only used to ensure the methodological comparability of the QOLIBRI in general population samples by MI analyses. It turned out that the factorial structure and the understanding of the HRQoL construct measured by the QOLIBRI were comparable between the general population samples and the TBI samples in both countries. Thus, the reference values established here are reliable.

Conclusions
This paper aimed to provide a basis for a better understanding of HRQoL after TBI in research and clinical practice. For this purpose, population-based reference values were developed to add value to the interpretation and clinical meaningfulness of QOLIBRI scores of individuals after TBI. Significant differences in the reported levels of HRQoL were found between the UK and the NL general population samples as well as between the TBI and the general population samples. Therefore, we have presented population-based reference values separately for the two countries. We recommend establishing population-based reference values also for other countries in future research, especially for lower-income countries, as these are a key component for understanding therapeutic progress in individual cases and enabling research on HRQoL.
Supplementary Materials: The following are available online at www.mdpi.com/2077-0383/9/7/2100/s1, Table  S1: Reference values for the QOLIBRI Cognition scale (UK), Table S2: Reference values for the QOLIBRI Cognition scale (NL). Acknowledgments: The authors would like to thank all study participants and CENTER-TBI investigators as well as Fabian Bockhop for his support concerning the preparation of the tables. The authors acknowledge support by the Open Access Publication Funds of the Göttingen University.

Conflicts of Interest:
The authors declare no conflicts of interest.   Identification constraints for the invariance models: Configural: item intercepts = 0, residual variances = 1, latent factor means = 0, latent factor variances = 1 Partial: item intercepts = 0, residual variances = 1. Only in the reference group latent factor means = 0 and variances = 1; Full: item intercepts = 0, residual variances = 1. Only in the reference group factor means = 0, factor variances = 1.  Note. a Probabilities estimated for the NL general population sample from the invariance model comparing TBI and general population sample; b Probabilities estimated for the UK general population sample from the invariance model comparing TBI and general population sample; c Probabilities estimated for the NL general population sample from the invariance model comparing UK and NL general population samples; d For measurement invariance testing with TBI samples response categories "not at all" and "slightly" were recorded as 1.