Psychometric Properties of the Short Form-36 (SF-36) in Parents of Children with Mental Illness

Given the stressful experiences of parenting children with mental illness, researchers and health professionals must ensure that the health-related quality of life of these vulnerable parents is measured with sufficient validity and reliability. This study examined the psychometric properties of the SF-36 in parents of children with mental illness. The data come from 99 parents whose children were currently receiving mental health services. The correlated two-factor structure of the SF-36 was replicated. Internal consistencies were robust (α > 0.80) for all but three subscales (General Health, Vitality, Mental Health). Inter-subscale and component correlations were strong. Correlations with parental psychopathology ranged from r = −0.32 to −0.60 for the physical component and r = −0.39 to −0.75 for the mental component. Parents with clinically relevant psychopathology had significantly worse SF-36 scores. SF-36 scores were inversely associated with the number of child diagnoses. The SF-36 showed evidence of validity and reliability as a measure of health-related quality of life in parents of children with mental illness and may be used as a potential outcome in the evaluation of family-centered approaches to care within child psychiatry. Given the relatively small sample size of this study, research should continue to examine its psychometric properties in more diverse samples of caregivers.


Introduction
Parents whose children have a mental illness report compromised health-related quality of life (HRQL) and more caregiver strain or psychological stress [1,2]. Because epidemiological studies show that 20% of children have a mental illness [3,4], there is a large proportion of parents with, or at risk for, poor HRQL. Meta-analytic evidence suggests that parents of children with mental illness have significantly worse HRQL compared to parents of healthy children and population norms [1]. Notably, this effect was moderated by child age: parents of older children with mental illness had worse HRQL compared to parents of younger children with mental illness. Understanding the contextual effects that contribute to compromised HRQL is important, because parents of children with mental illness take leading roles in caregiving, seeking and obtaining professional health services, and supporting the overall functioning of the family [5,6]. There are negative consequences for children with mental illness and families when the HRQL of their parents is compromised [7,8]. Addressing key questions of whether, why, and how parents are affected by caring for a child with mental illness can help direct resources and treatment programming using family-centered care frameworks within child and adolescent psychiatry.
However, given the unique, and often stressful, experiences of parenting children with mental illness, researchers and health professionals must ensure that the HRQL of these vulnerable parents is measured with sufficient validity and reliability. Such methodological work is needed so that meaningful comparisons can be made between this population of parents and parents of healthy children or parents of children with other chronic illnesses, as well as population norms, to determine the extent to which child mental illness impacts parent HRQL.
The Short Form-36 (SF-36) is one of the most commonly used measures of adult HRQL [9] and has a strong track record of use among parents of children with mental illness [1]. However, no studies have assessed its psychometric properties in this population. Its pervasiveness in health research stems from the comprehensiveness of its development to conceptualize health as a multidimensional construct and to measure the range of health states, including well-being and individual perceptions of health. The SF-36 includes 36 items that measure eight health-related concepts or subscales: Physical Functioning (ability to perform physical activities), Role-physical (problems with activities as a result of physical health), Bodily Pain (level of pain/limitations), General Health (perception of overall health), Vitality (feelings of energy and fatigue), Social Functioning (ability to perform social activities), Mental Health (feelings related to internalizing problems), and Role-emotional (problems with activities as a result of mental/emotional health). These subscales define physical and mental summary component scores that accounted for approximately 80% of the variance in the SF-36 in the samples used in its initial development [10][11][12]. Further, these component scores have proved useful in interpreting HRQL [10][11][12]. Substantial evidence supports this two-factor structure of the SF-36, whereby the eight subscales (manifest variables) load onto the physical and mental summary components (latent variables), as well as indicators of its construct validity and reliability (i.e., internal consistency), in the general population [13,14], and in specific subpopulations of adults including pregnant women [15], those with chronic physical illnesses [16] and their caregivers [17], and the elderly [18].
This study assessed the psychometric properties of the SF-36 in a sample of parents whose children have mental illness. Specifically, our objectives were to replicate the two-factor structure of the SF-36 (eight subscales loading onto the two summary components); examine internal consistency of the SF-36 and inter-subscale correlations; and investigate construct validity of the SF-36 with self-reported measures of psychopathology (depression, anxiety, parental stress) and child diagnoses of mental illness. It was anticipated that a two-factor (physical and mental) correlated structure would be confirmed and that, based on our sample size and number of items in each subscale, internal consistency reliabilities of the SF-36 would be "good" (α ≥ 0.70) [19]. We did not expect to find age or sex differences in SF-36 scores in our sample of parents. Furthermore, it was expected that correlations with measures of psychopathology would be statistically significant and at least small in magnitude (r ≥ 0.20) and larger for the mental versus the physical component score. We hypothesized that significantly lower SF-36 scores would be found in parents with clinically relevant symptoms of depression or anxiety and that parent SF-36 scores would be inversely associated with the number of mental illness diagnoses in their children in a dose-response manner.

Sample
This is a secondary data analysis of a previously reported cross-sectional study of children who were currently receiving inpatient or outpatient mental health services at a pediatric hospital in Canada [20]. The inclusion criteria for the study were children who were 4-17 years of age, who spoke and understood English, and had a parent/caregiver with adequate English skills to complete the data collection interview and questionnaires. Children were excluded if their mental state prevented completion of the interview and questionnaires; examples included currently experiencing a psychotic episode or exhibiting violent/aggressive behavior which could put the child or research staff at harm. A total of 100 parents agreed to participate in the study; however, one parent did not complete the study interview. Thus, the sample available for analysis was n = 99.
Research staff worked alongside nurses within the mental health program to identify eligible children. Those interested in participating were referred to research staff who introduced the study to children and sought permission to contact their parents to obtain initial oral consent for participation. Research staff scheduled data collection interviews in a study room at the hospital for inpatients or at our research office for outpatients. Clinic rosters were also used to identify eligible children. These rosters included the contact information for families receiving mental health services who agreed to be contacted about research studies. Telephone contact was made with these parents to confirm eligibility and participation in the study. Interview and questionnaire data were collected electronically using laptops in an effort to minimize missing data.
Children were screened for current mental illness by trained research staff using the Mini International Neuropsychiatric Interview for Children and Adolescents (MINI-KID), a structured diagnostic interview for children aged < 18 years that is aligned with mental illness as defined in the Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) and the International Statistical Classification of Diseases and Related Health Problems (ICD-10) [21]. Screening was completed for major depressive episode, generalized anxiety disorder, separation anxiety disorder, phobia (generalized or non-generalized), attention-deficit hyperactivity disorder, and oppositional defiant/conduct disorder, which represent the most common mental illnesses affecting children [3]. The MINI-KID has excellent psychometric properties [22][23][24].
Parents provided informed written consent for themselves and all participating children. Children aged 8-15 years provided informed written assent and children 15-17 years provided informed written consent in conjunction with parental consent. Ethical approval was obtained from the Hamilton Integrated Research Ethics Board.

Measures
The SF-36 measures HRQL in the past four-week period across eight subscales and yields two component scores, physical and mental [9][10][11][12]. The Physical component score comprises the Physical Functioning (10 items), Role-physical (4 items), Bodily Pain (2 items), and General Health (5 items) subscales. The Mental component score comprises the Vitality (4 items), Social Functioning (2 items), Mental Health (5 items), and Role-emotional (3 items) subscales. Each subscale is scored using the Likert method of summated ratings (0-100), with higher scores indicating better HRQL. An additional item (not scored) asks individuals to assess their current health relative to one year prior. Mean scores for each subscale are computed. T scores with a mean of 50 and standard deviation (SD) of 10 are computed for the physical and mental component scores. These T scores are based on the sum of weighted z-transformed subscale scores computed using population norms [25]. Internal consistencies of the SF-36 subscales have been reported to be 0.78-0.93 in the general population [10]. The SF-36 can be viewed at https://www.rand.org/health-care/surveys_tools/mos/36-item-short-form.html (accessed on 18 January 2021).
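The T-score transformation described above can be illustrated with a minimal sketch. It assumes only what the text states: subscale scores are z-transformed against population norms, combined as a weighted sum, and rescaled to a mean of 50 and SD of 10. The function names are hypothetical, and the norm means, norm SDs, and weights must be taken from the published norms [25]; none are reproduced here.

```python
def z_score(raw, norm_mean, norm_sd):
    """Standardize a 0-100 subscale score against a population norm."""
    return (raw - norm_mean) / norm_sd


def component_t_score(subscale_scores, norm_means, norm_sds, weights):
    """Weighted sum of z-scores rescaled to mean 50, SD 10.

    All four arguments are dicts keyed by subscale name. The norm and
    weight values are inputs (assumption: supplied from published norms);
    nothing in this sketch encodes the actual SF-36 scoring coefficients.
    """
    aggregate = sum(
        weights[name]
        * z_score(subscale_scores[name], norm_means[name], norm_sds[name])
        for name in subscale_scores
    )
    return 50.0 + 10.0 * aggregate
```

Under this transformation, a respondent whose subscale scores all equal the norm means receives a component score of exactly 50, whatever the weights.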
Symptoms of depression were measured using the Center for Epidemiological Studies Depression scale (CESD) [26]. The scale includes 20 items that assess depressed affect (7 items), positive affect (4 items), somatic activity (7 items), and interpersonal relations (2 items) during the past seven-day period. A four-point Likert response scale (0-3) is used to measure frequency of symptoms experienced ranging from "rarely or none of the time (less than 1 day)" to "most or all of the time (5-7 days)." Total scores range from 0-60, with higher scores indicating greater symptomatology or impairment. Individuals with CESD scores ≥ 16 are considered to have clinically relevant symptoms of depression [26,27]. The CESD has been validated in parents [28], including parents whose children have a chronic illness [29]. The internal consistency of the CESD in our study was α = 0.86.
Symptoms of anxiety were measured using the State-Trait Anxiety Inventory (STAI), which provides an assessment of how individuals generally feel, as well as their propensity for perceived anxiety [30]. Of the 40 items in the STAI, only the 20 items related to trait anxiety were used in our study. The STAI has been shown to assess both negative (9 items) and positive affect (11 items) [31]. Responses were scored on a four-point Likert scale (1-4) from "Almost never" to "Almost always" and summed. Scores range from 20-80, with higher scores indicating greater levels of anxiety. Individuals with STAI scores ≥ 39 are considered to have clinically relevant symptoms of anxiety [30,32]. The STAI has robust psychometric properties [30] and has been used to measure symptoms of anxiety in parents of children with mental illness [32]. The internal consistency of the STAI in our study was α = 0.72.
Parental stress was measured using the Parental Stress Scale (PSS). The PSS is an 18-item measure that assesses the positive and negative experiences of parenting across the domains of rewards (6 items), satisfaction (3 items), stressors (6 items), and loss of control (3 items) [33]. Response options are based on a five-point Likert scale (1-5) and range from "strongly disagree" to "strongly agree". After adjusting eight reverse-coded items, scores are summed to provide a total composite score ranging from 18-90. Higher scores indicate more symptoms of parental stress. The PSS has undergone psychometric assessment in various populations, and has a track record of use in parents of children with chronic illnesses [34,35]. The internal consistency of the PSS in our study was α = 0.85.
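The summated scoring with reverse-coded items described for the PSS can be sketched as follows. This is an illustration under stated assumptions, not the instrument's official scoring routine: the function name is hypothetical, and the positions of the eight reverse-coded items are not given in the text, so they are passed in as a parameter rather than hard-coded.

```python
def score_likert_total(responses, reverse_items, low=1, high=5):
    """Sum Likert responses after flipping the reverse-coded items.

    responses     -- list of integer responses, one per item
    reverse_items -- set of zero-based item positions to reverse
                     (assumption: supplied by the user from the scale manual)
    low, high     -- bounds of the response scale (1-5 for the PSS)
    """
    total = 0
    for index, value in enumerate(responses):
        if not low <= value <= high:
            raise ValueError(f"item {index} has out-of-range response {value}")
        # A reversed response maps low -> high and high -> low.
        total += (low + high - value) if index in reverse_items else value
    return total
```

With 18 items on a 1-5 scale, the total is bounded by 18 and 90, matching the range stated above. The same function covers the CESD (`low=0, high=3`) and the trait items of the STAI (`low=1, high=4`) once their reverse-coded items are specified.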

Analysis
We modelled the SF-36 factor structure with confirmatory factor analysis, estimated by maximum likelihood with robust standard errors (Huber-White sandwich estimator) to account for the non-normality of manifest variables [36]. We defined adequate model fit using the following established thresholds: χ² goodness-of-fit p > 0.05; Comparative Fit Index (CFI) ≥ 0.90; Standardized Root Mean Square Residual (SRMR) ≤ 0.08; and Root Mean Square Error of Approximation (RMSEA) ≤ 0.06 [37]. Following previous reports [34][38][39][40], if at least two indices met their respective thresholds, model fit was determined to be adequate. If inadequate fit was found, we reviewed modification indices to identify opportunities to improve the model. On account of the relatively small sample size and aligned with previous research [13][14][15][16][17][18], we modelled the eight SF-36 subscales as manifest variables and conducted Monte Carlo simulations to estimate the statistical power in modelling the two-factor structure of the SF-36 (i.e., the most complex model tested with a sample size to item ratio of 99:10, whereby the 10 items represent the eight subscales loading onto the two summary scores) [41]. We set standardized factor loadings to range from λ = 0.65-0.70 (conservative estimates based on previous reports) and statistical power was estimated to be 1-β = 0.75-0.84.
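The decision rule for model fit stated above (adequate if at least two of the four indices meet their thresholds) can be written out as a short sketch. The thresholds are those given in the text; the function names are hypothetical, and the index values would come from the model output.

```python
def meets_thresholds(chi2_p, cfi, srmr, rmsea):
    """Return, per index, whether the stated fit threshold is met."""
    return {
        "chi2_p": chi2_p > 0.05,   # non-significant goodness-of-fit test
        "cfi": cfi >= 0.90,
        "srmr": srmr <= 0.08,
        "rmsea": rmsea <= 0.06,
    }


def adequate_fit(chi2_p, cfi, srmr, rmsea, minimum_met=2):
    """Apply the 'at least two indices' rule described in the text."""
    met = meets_thresholds(chi2_p, cfi, srmr, rmsea)
    return sum(met.values()) >= minimum_met
```

For example, a model with CFI = 0.94 and SRMR = 0.07 would be judged adequate under this rule even if its χ² p value and RMSEA both missed their thresholds.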
We estimated correlations among SF-36 subscales and component scores using Pearson coefficients. We also used Pearson coefficients to examine discriminant validity of the SF-36 with parent age and convergent validity with the CESD, STAI, and PSS. To determine if correlations were stronger for the mental component score versus the physical component score, we used the method of variance estimates recovery [42]. Discriminant validity was also tested by comparing SF-36 scores by parent sex (t test). Additional convergent validity was examined by categorizing our sample of parents according to the cut-points on the CESD and STAI for clinically relevant symptoms of depression and anxiety, respectively, and using t tests to compare SF-36 component scores between subgroups. We conducted an analysis of variance to determine if SF-36 component scores were associated with the number of child mental illness diagnoses. All hypothesis tests were two-sided with α = 0.05. Factor analyses were conducted using Mplus 6.11; descriptive statistics, comparative tests, and validation analyses were conducted using SPSS 21.
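The known-groups comparison described above can be sketched in two steps: split parents at a clinical cut-point, then compare component scores between the subgroups. The cut-points (CESD ≥ 16, STAI ≥ 39) are those stated in the Measures section. The function names are hypothetical, and the pooled-variance (Student) form of the t statistic is an assumption, since the text does not say which variant was used.

```python
from math import sqrt
from statistics import mean, variance

CESD_CUTOFF = 16  # clinically relevant symptoms of depression
STAI_CUTOFF = 39  # clinically relevant symptoms of anxiety


def split_by_cutoff(symptom_scores, outcome_scores, cutoff):
    """Return outcome scores for parents at/above and below the cut-point."""
    pairs = list(zip(symptom_scores, outcome_scores))
    elevated = [y for x, y in pairs if x >= cutoff]
    not_elevated = [y for x, y in pairs if x < cutoff]
    return elevated, not_elevated


def pooled_t_statistic(group_a, group_b):
    """Student's t statistic and degrees of freedom (equal variances assumed)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(group_a)
                  + (n_b - 1) * variance(group_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df
```

A negative t statistic for the elevated group relative to the non-elevated group corresponds to the hypothesized pattern of lower SF-36 scores in parents with clinically relevant symptoms.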

Sample Characteristics
Parents were, on average, 45.3 (SD 6.7) years of age and the majority were female (n = 84; 85%). Nearly two-thirds (n = 60; 61%) had a partner (i.e., married or common-law) and had completed postsecondary education (n = 65; 66%). Half of parents (n = 49) reported an annual household income of at least CAD 75,000, the median household income according to the 2016 Canadian Census. Immigrants were underrepresented in this sample of parents (n = 13; 13%).
Children had a mean age of 13.9 (SD 3.1; range 8-17) years and 71% (n = 70) were female. Parent physical and mental component scores were not associated with child age or sex. The most common illnesses affecting children were major depression and generalized anxiety (n = 72; 73% each). The remaining diagnoses were: phobia (71%), oppositional defiant (47%), attention-deficit hyperactivity (33%), separation anxiety (32%), and conduct disorder (25%). As is typical in child and adolescent psychiatry, comorbidity was prevalent: 15%, 26%, and 46% of children had two, three, or four or more mental illnesses, respectively. Overall, 38 (38%) of children were recruited from the inpatient setting. While physical component scores were similar between parents of children receiving inpatient vs. outpatient services (40.

Factor Structure
We tested three factor structures of the SF-36 and the fit indices for these models are shown in Table 2. Prior to fitting the factor models, baseline models of the physical and mental components were specified to ensure adequate fit. Both components showed adequate fit to the data, though the physical component model had lower (i.e., better) SRMR (0.06 vs. 0.08) and RMSEA (0.26 vs. 0.33) indices. The one-factor model in which all subscales were loaded onto a single HRQL factor fit the data poorly, with none of the fit indices achieving an adequate fit threshold. The two-factor model which did not specify a correlation between the physical and mental component factors also had inadequate fit to the data; again, none of the fit indices achieved an adequate fit threshold. The two-factor model which specified a correlation between the physical and mental component factors showed adequate fit to the data: CFI = 0.94 and SRMR = 0.07. It also demonstrated significantly better fit according to the χ² difference test against the one-factor (χ² = 79.1; p ≤ 0.001) and two-factor uncorrelated models (χ² = 167.1; p ≤ 0.001).

Internal Consistency
Internal consistency reliabilities of the SF-36 subscales were generally robust (α > 0.80), with the exception of General Health (α = 0.65), Vitality (α = 0.59), and Mental Health (α = 0.64), which were lower than established guidelines indicating good reliability. Internal consistencies of the physical and mental components were very strong (α = 0.91 and α = 0.85, respectively). Relatedly, correlations among subscales of the SF-36 were also robust (Table 3); correlations that were statistically significant (p < 0.01) ranged from a low of r = 0.31 between General Health and Role-emotional to a high of r = 0.65 between Physical Functioning and Bodily Pain. Non-significant correlations were found between Physical Functioning and Vitality (r = 0.06), Role-emotional (r = 0.23), and Mental Health (r = 0.20). The correlation between Bodily Pain and Vitality was also not significant (r = 0.23). The correlation between the physical and mental components was very strong (r = 0.75).
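For readers less familiar with the internal consistency coefficient reported above, a minimal sketch of Cronbach's α is given here. It uses the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of the total score) and is not the SPSS routine used in the study; the function name and data layout are assumptions made for illustration.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Cronbach's alpha for one subscale.

    item_scores -- list of respondents, each a list of k item responses
                   (complete data assumed; no missing values handled).
    """
    k = len(item_scores[0])
    if k < 2:
        raise ValueError("alpha requires at least two items")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([respondent[i] for respondent in item_scores])
        for i in range(k)
    ]
    # Sample variance of the summed subscale score.
    total_variance = variance([sum(respondent) for respondent in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)
```

Because the ratio of summed item variances to total variance depends on how strongly items covary, subscales whose items are only weakly related yield lower coefficients, all else being equal.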

Construct Validity
The SF-36 demonstrated no significant correlation with parent age for either the physical component (r = −0.08; p = 0.464) or the mental component (r = −0.03; p = 0.774) scores. Further, differences between mothers and fathers were not statistically significant-physical: 45 (Table 4). Correlations were all statistically significant (p < 0.001) and generally medium-sized in magnitude. While correlations were larger with the mental component score, the differences were not statistically significant compared to the physical component score. Using established thresholds for the CESD and STAI, we dichotomized our sample of parents as having or not having clinically relevant symptoms of depression or anxiety to determine if SF-36 scores would be different between these groups of parents. As shown in Table 5, parents with elevated CESD or STAI scores had significantly lower physical and mental component scores.
We then tested the extent to which SF-36 component scores in parents were associated with the number of mental illness diagnoses in children (Table 6). Both physical and mental component scores decreased significantly with increases in the number of child diagnoses (F = 7.68; p < 0.001 and F = 6.10; p = 0.001, respectively). Post hoc comparisons showed that parents of children with one mental illness had significantly higher SF-36 physical component scores compared to parents of children with three or four mental illnesses (p = 0.015 and p < 0.001, respectively). Parents of children with two diagnoses had significantly higher physical component scores compared to parents of children with four diagnoses (p = 0.026). With regard to the mental component, parents of children with one mental illness had significantly higher scores compared to parents of children with three or four mental illnesses (p = 0.038 and p = 0.001, respectively). The MINI-KID was not completed for six children, who were thus excluded from these analyses.
Listed post hoc Scheffé contrasts were significant at p < 0.05.
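The analysis of variance across groups defined by the number of child diagnoses can be illustrated with a minimal sketch of the one-way F statistic. This is a generic textbook computation, not the SPSS procedure used in the study, and it does not reproduce the Scheffé post hoc contrasts; the function name is hypothetical.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_values = [value for group in groups for value in group]
    grand_mean = mean(all_values)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(group) * (mean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (value - mean(group)) ** 2 for group in groups for value in group
    )
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within
```

Here each inner list would hold the component scores of parents whose children share the same number of diagnoses; a large F indicates that mean scores differ across these groups by more than within-group variability would suggest.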

Discussion
This study aimed to replicate the factor structure of the SF-36, estimate internal consistency and inter-subscale correlations, and describe its construct validity with measures of parent psychopathology and child mental illness. Evidence from this study supported our hypothesis that the SF-36 has a correlated two-factor structure and corroborates a previous large-scale study investigating the factor structure of the SF-36 [14]. This correlation is intuitive as there is ample evidence supporting the reciprocal link between physical and mental health across the life course [43][44][45][46]. Previous studies have tested the fit of additional models, including four-factor [47], higher-order factor [15], and bifactor [48] solutions. We did not investigate such models for two reasons. First, our sample size lacked the statistical power to reliably estimate such complex models. Second, the two component scores of the SF-36 are most commonly used in research and clinical practice [49], and the use of an overall HRQL score based on the SF-36 still warrants further validation in large-scale population studies [50].
Reliability was very strong for both the physical and mental components and most of the eight subscales. Internal consistencies were considered "moderate" for the General Health and Mental Health subscales given the number of items for each subscale and our sample size [19]. Vitality was considered "unsatisfactory" as a large proportion of its variance, approximately 40%, was due to measurement error. Thus, in the context of parents whose children have mental illness, this subscale has limited predictive utility and should be used with caution. Our hypothesis regarding the correlations among SF-36 subscale and component scores was partially supported. All correlations were in the expected direction and magnitude. This finding provided further evidence supporting the two-factor structure of the SF-36 in this population of parents.
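The figure of approximately 40% for Vitality follows from a standard classical test theory reading of the coefficient, sketched here under the assumption that α is taken as an estimate of the proportion of observed-score variance attributable to true scores (the symbols below are introduced for illustration and do not appear elsewhere in the paper):

```latex
\[
\frac{\sigma^2_{E}}{\sigma^2_{X}}
  = 1 - \frac{\sigma^2_{T}}{\sigma^2_{X}}
  \approx 1 - \alpha
  = 1 - 0.59
  = 0.41
\]
```

where \(\sigma^2_{X}\), \(\sigma^2_{T}\), and \(\sigma^2_{E}\) denote observed, true-score, and error variance, respectively. By the same reading, the corresponding proportions for General Health and Mental Health would be about 0.35 and 0.36.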
Discriminant validity of the SF-36 was demonstrated with similar physical and mental component scores across age and sex. Despite evidence of age and sex differences in the SF-36 among population norms [25], our null findings were expected. First, variability in the age of our sample of parents was relatively small. Second, reported age differences in population studies are small and do not necessarily reflect clinically relevant differences in HRQL [25,51]. Third, given the context in which we studied our sample (eligible parents were those who were the primary caregiver for their children), comparisons may better reflect gender role, rather than sex. Previous reports have shown that among parents of children with a chronic illness, strain and psychological distress are similar among primary caregiving fathers and primary caregiving mothers [52].
Convergent validity of the SF-36 was also demonstrated. Findings showed that the mental component of the SF-36 taps into symptoms of internalizing disorders better than generic caregiver strain. While stronger correlations were found for the mental versus physical component score, these differences were not significant and reaffirm the physical-mental health link. Similar findings were seen when comparing parents with versus without clinically relevant symptoms of psychopathology, a pattern that has been reported previously [53].
Finally, worse HRQL in parents of children with multiple mental illnesses was supported. This inverse dose-response association was linear in nature and likely reflects the challenges experienced by parents caring for a child with multiple illnesses, a common occurrence in child and adolescent psychiatry [4]. The number of comorbidities can also be a proxy for illness severity. Inverse associations between child illness severity and parent HRQL have been reported in samples of children with physical [54] and mental illnesses [55,56].

Study Implications
The findings extend the evidence for the validity and reliability of the SF-36 and should provide confidence in using this measure in parents of children with mental illness. Replication of its factor structure suggests that the HRQL construct is interpreted in this sample of parents in the same manner in which the SF-36 was initially developed. Thus, it can be used to make comparisons across samples and with population norms. The design of the current study did not allow for the examination of test-retest reliability and predictive validity of the SF-36. Given the uniqueness of this parent population, these important psychometric properties should be evaluated in future studies. Additional work is needed to demonstrate that the SF-36 has clinical utility as a patient-reported outcome assessing change in HRQL. Quantifying its responsiveness to change and calibrating minimally important differences are necessary prior to implementing the SF-36 as a potential outcome in the evaluation of family-centered approaches to care within child and adolescent psychiatry. Further, given the strong correlation between the mental component score and the CESD (and to a lesser extent, the STAI), it appears as though the same underlying construct, internalizing symptoms, is being measured [57]. In conducting future studies, researchers may consider including the SF-36 to measure both HRQL and parent psychopathology in order to reduce respondent burden.

Limitations
There are notable limitations to this study. First, the relatively small sample size prevented the examination of more complex factor structures that could potentially result in a better fitting model (e.g., bifactor), and the underrepresentation of immigrants [58] may limit the generalizability of the findings to this population of parents of children with mental illness. Relatedly, recruitment from a single site may not adequately reflect the broader experience of this population. Second, shared method variance may have resulted in overestimated correlations and associations. Third, the absence of a control group (i.e., parents of children without mental illness) prevented the opportunity to formally test for measurement invariance of the SF-36, which would have further contributed to its validation in this population.

Conclusions
The SF-36 appears to be a valid and reliable tool to measure HRQL among parents of children with mental illness. Researchers and health professionals should continue to study its psychometric properties in larger and more diverse samples with multiple informants. Collaboration among researchers and health professionals is needed to extend the clinical utility of the SF-36 within child and adolescent psychiatry.

Informed Consent Statement:
Informed consent was obtained from all subjects involved in the study.

Data Availability Statement:
The data presented in this study are available on request from the corresponding author. The data are not publicly available due to restrictions imposed by the institutional review board.